mirror of
https://github.com/NaiboWang/EasySpider.git
synced 2025-04-19 03:39:42 +08:00
Compare commits
191 Commits
Author | SHA1 | Date | |
---|---|---|---|
![]() |
fc5aa8368b | ||
![]() |
793f028a00 | ||
![]() |
ae22977143 | ||
![]() |
541b3c13d2 | ||
![]() |
a6192b730c | ||
![]() |
d39218f5fd | ||
![]() |
a94c45b36d | ||
![]() |
0e8aba6b51 | ||
![]() |
e42ad07d80 | ||
![]() |
2f6344d00b | ||
![]() |
bfa6c0de76 | ||
![]() |
b590cc22c5 | ||
![]() |
d69adacbd1 | ||
![]() |
15654da7eb | ||
![]() |
967f5b8033 | ||
![]() |
aa419ee845 | ||
![]() |
f005e48700 | ||
![]() |
4e96ed7d50 | ||
![]() |
e3fecc8926 | ||
![]() |
119cb99711 | ||
![]() |
f43bdd236d | ||
![]() |
56f0847500 | ||
![]() |
0df6cebd18 | ||
![]() |
4b42f6300c | ||
![]() |
2cf33794f1 | ||
![]() |
9efd3b6efe | ||
![]() |
ad956be10d | ||
![]() |
01de17d471 | ||
![]() |
333dcd3ff4 | ||
![]() |
555f02815c | ||
![]() |
34ed41110a | ||
![]() |
32459b622d | ||
![]() |
02cd8599b0 | ||
![]() |
2feede55db | ||
![]() |
33dda444d7 | ||
![]() |
d7ccb22d01 | ||
![]() |
f7a842eed6 | ||
![]() |
ea6fb049f5 | ||
![]() |
5216ffba82 | ||
![]() |
4f0851e361 | ||
![]() |
7bb9d5a374 | ||
![]() |
c56e87120d | ||
![]() |
5180f47b70 | ||
![]() |
b4d7ddf5cb | ||
![]() |
2031b09297 | ||
![]() |
cc9a8082da | ||
![]() |
3daf5e8c21 | ||
![]() |
8f5d7a3a52 | ||
![]() |
ee4a077630 | ||
![]() |
3fe6f42366 | ||
![]() |
eb3b578745 | ||
![]() |
4ca5333f8b | ||
![]() |
b50d4eae3f | ||
![]() |
998a1ddb19 | ||
![]() |
07563bc750 | ||
![]() |
7b5ccf4a78 | ||
![]() |
209235de8d | ||
![]() |
72529c0675 | ||
![]() |
081c49357e | ||
![]() |
b611ddb6cd | ||
![]() |
abfac8c342 | ||
![]() |
951a39fff6 | ||
![]() |
6d3d10f7a7 | ||
![]() |
46b1959564 | ||
![]() |
e14896d7cd | ||
![]() |
450dfa1a77 | ||
![]() |
3b907ba382 | ||
![]() |
70dd90470f | ||
![]() |
cc8bb70715 | ||
![]() |
c5f1696f11 | ||
![]() |
b987408fc2 | ||
![]() |
391f0ea99d | ||
![]() |
a94b67a1f6 | ||
![]() |
54ef89aef7 | ||
![]() |
22a3b45f13 | ||
![]() |
44bfb69a36 | ||
![]() |
5c1207649d | ||
![]() |
c967db3dac | ||
![]() |
baec9c4298 | ||
![]() |
3e7abd6273 | ||
![]() |
32df9d5060 | ||
![]() |
05c52f9dc8 | ||
![]() |
7c4dafc002 | ||
![]() |
2afaf43162 | ||
![]() |
b79d92df1d | ||
![]() |
e4e1a1b095 | ||
![]() |
048dfb1f4b | ||
![]() |
1750481744 | ||
![]() |
3ead5e7312 | ||
![]() |
81957adb52 | ||
![]() |
dbad074565 | ||
![]() |
8342135b36 | ||
![]() |
e74915d94c | ||
![]() |
df62f710e3 | ||
![]() |
118241ba6d | ||
![]() |
de47e8516a | ||
![]() |
d438e4b19d | ||
![]() |
0003041dab | ||
![]() |
ec3d9094bf | ||
![]() |
629509a588 | ||
![]() |
5e17563d11 | ||
![]() |
5acafe7948 | ||
![]() |
c25f80c175 | ||
![]() |
ab88b33c74 | ||
![]() |
7442e43be3 | ||
![]() |
a0518412b0 | ||
![]() |
9ccb56aeae | ||
![]() |
3601ddb14d | ||
![]() |
728a5cb3ea | ||
![]() |
46909e4866 | ||
![]() |
072b6ad21e | ||
![]() |
bf320abf1a | ||
![]() |
2d7c3c1323 | ||
![]() |
c185e914e7 | ||
![]() |
7c0ab0e519 | ||
![]() |
f50b08e9c4 | ||
![]() |
ff7d82f4d0 | ||
![]() |
944d968679 | ||
![]() |
9f1f152680 | ||
![]() |
18321e4fee | ||
![]() |
b79bda9001 | ||
![]() |
80bc210ff1 | ||
![]() |
dbf7681518 | ||
![]() |
f18616e3ff | ||
![]() |
911ea02f3f | ||
![]() |
22f86cf0f2 | ||
![]() |
0285246337 | ||
![]() |
4fdce9a915 | ||
![]() |
15aab7c0c5 | ||
![]() |
3ec64d2623 | ||
![]() |
5582205204 | ||
![]() |
c272e5da86 | ||
![]() |
52702d4eb3 | ||
![]() |
a8e77b5e15 | ||
![]() |
606de75577 | ||
![]() |
76fd4bad55 | ||
![]() |
2860bc7b8c | ||
![]() |
ebe8a56a6f | ||
![]() |
e086de2852 | ||
![]() |
c2d16e13c2 | ||
![]() |
e43318f57a | ||
![]() |
7849707486 | ||
![]() |
b1632459ef | ||
![]() |
a2bd496e8e | ||
![]() |
9ed61c4f50 | ||
![]() |
c8b71835de | ||
![]() |
0afa159c98 | ||
![]() |
3ba748b101 | ||
![]() |
818d3e0ddc | ||
![]() |
ad568af5f3 | ||
![]() |
b2a6fd6b6b | ||
![]() |
960cf74de1 | ||
![]() |
fce97dec61 | ||
![]() |
3ffd34d0fd | ||
![]() |
1b6661afb8 | ||
![]() |
3350c50600 | ||
![]() |
6b14afcf00 | ||
![]() |
73c9a3a647 | ||
![]() |
93a49d8c58 | ||
![]() |
5cf81ce5d5 | ||
![]() |
7d9ae708b2 | ||
![]() |
590c9907a4 | ||
![]() |
4c85fdbf5d | ||
![]() |
8f4bd8709c | ||
![]() |
38d329fe27 | ||
![]() |
10b0210983 | ||
![]() |
ea6f17477d | ||
![]() |
a43189d4cd | ||
![]() |
3c1b4a1019 | ||
![]() |
fe2a3ee87a | ||
![]() |
d1b7b247b8 | ||
![]() |
49241abf02 | ||
![]() |
cb47353da6 | ||
![]() |
a971b52d38 | ||
![]() |
a365783e41 | ||
![]() |
ab47ed2be0 | ||
![]() |
838616e131 | ||
![]() |
7d247d68ec | ||
![]() |
c5a4b11dfb | ||
![]() |
2a241010a9 | ||
![]() |
499c3f21b6 | ||
![]() |
4f858ffee1 | ||
![]() |
580af6faaa | ||
![]() |
4e53596680 | ||
![]() |
0ded0fb67c | ||
![]() |
66918e347c | ||
![]() |
cff8ae5b93 | ||
![]() |
8ea69c9d0f | ||
![]() |
0f5c6a89bf | ||
![]() |
c8d6017190 | ||
![]() |
def19ba4bf |
25
.github/ISSUE_TEMPLATE.md
vendored
Normal file
@ -0,0 +1,25 @@
|
||||
## 版本信息 | Version Information
|
||||
**EasySpider版本 | EasySpider Version**:
|
||||
**系统版本(架构) | System Version (Architecture)**:
|
||||
**浏览器版本 | Browser Version**:
|
||||
**安装方式 | Installation method**:
|
||||
|
||||
## 问题描述 | Issue Description
|
||||
|
||||
|
||||
## 如何复现 | Steps to Reproduce
|
||||
|
||||
## 示例任务文件 | Example Task File
|
||||
|
||||
Windows和Linux版本的软件设计的任务文件在软件目录下的`tasks`文件夹中,文件名为任务列表中`任务的ID号.json`;MacOS系统的任务文件目录请运行下面的命令打开tasks文件夹:
|
||||
|
||||
On Windows and Linux, task files are stored in the `tasks` folder inside the software directory, and each file is named `<task ID>.json` after the task's ID in the task list; on macOS, run the following command to open the tasks folder:
|
||||
|
||||
```bash
|
||||
cd /Users/$(whoami)/Library/Application\ Support/EasySpider/tasks
|
||||
open .
|
||||
```
|
||||
|
||||
请将任务文件直接以文件的方式粘贴到这里,不要截图和打开复制里面的内容。
|
||||
|
||||
Please attach the task file here as a file; do not post screenshots or open the file and copy its contents.
|
4
.gitignore
vendored
@ -13,4 +13,6 @@ old_code/
|
||||
*.mp4
|
||||
*.tar.xz
|
||||
*.zip
|
||||
Data/
|
||||
Data/
|
||||
**/__pycache__/
|
||||
**/.venv/
|
4
.temp_to_pub/.gitignore
vendored
@ -1,10 +1,10 @@
|
||||
EasySpider_MacOS/easyspider_executestage
|
||||
EasySpider_MacOS/easyspider_executestage_full
|
||||
EasySpider_Linux64_x64/user_data
|
||||
EasySpider_windows_x32/user_data
|
||||
EasySpider_Windows_x32/user_data
|
||||
EasySpider
|
||||
EasySpider.app/
|
||||
EasySpider_windows_x64/user_data
|
||||
EasySpider_Windows_x64/user_data
|
||||
*.tmp
|
||||
*.tar.gz
|
||||
*.7z*
|
||||
|
@ -5,9 +5,11 @@ import copy
|
||||
import platform
|
||||
import shutil
|
||||
import string
|
||||
import threading
|
||||
# import undetected_chromedriver as uc
|
||||
from utils import detect_optimizable, download_image, extract_text_from_html, get_output_code, isnotnull, lowercase_tags_in_xpath, myMySQL, new_line, \
|
||||
on_press_creator, on_release_creator, readCode, replace_field_values, send_email, split_text_by_lines, write_to_csv, write_to_excel, write_to_json
|
||||
on_press_creator, on_release_creator, readCode, rename_downloaded_file, replace_field_values, send_email, split_text_by_lines, write_to_csv, write_to_excel, write_to_json
|
||||
from constants import WriteMode, DataWriteMode, GraphOption
|
||||
from myChrome import MyChrome
|
||||
from threading import Thread, Event
|
||||
from PIL import Image
|
||||
@ -30,7 +32,6 @@ from selenium.webdriver.common.action_chains import ActionChains
|
||||
from selenium.webdriver.common.keys import Keys
|
||||
from selenium.webdriver.chrome.options import Options
|
||||
from selenium.webdriver.chrome.service import Service
|
||||
from pynput.keyboard import Key, Listener
|
||||
from datetime import datetime
|
||||
import io # 遇到错误退出时应执行的代码
|
||||
import json
|
||||
@ -75,10 +76,7 @@ class BrowserThread(Thread):
|
||||
def __init__(self, browser_t, id, service, version, event, saveName, config, option):
|
||||
Thread.__init__(self)
|
||||
self.logs = io.StringIO()
|
||||
try:
|
||||
self.log = bool(service["recordLog"])
|
||||
except:
|
||||
self.log = True
|
||||
self.log = bool(service.get("recordLog", True))
|
||||
self.browser = browser_t
|
||||
self.option = option
|
||||
self.config = config
|
||||
@ -86,22 +84,13 @@ class BrowserThread(Thread):
|
||||
self.totalSteps = 0
|
||||
self.id = id
|
||||
self.event = event
|
||||
try:
|
||||
self.saveName = service["saveName"] # 保存文件的名字
|
||||
except:
|
||||
now = datetime.now()
|
||||
# 将时间格式化为精确到秒的字符串
|
||||
self.saveName = now.strftime("%Y_%m_%d_%H_%M_%S")
|
||||
now = datetime.now()
|
||||
self.saveName = service.get("saveName", now.strftime("%Y_%m_%d_%H_%M_%S")) # 保存文件的名字
|
||||
self.OUTPUT = ""
|
||||
self.SAVED = False
|
||||
self.BREAK = False
|
||||
self.CONTINUE = False
|
||||
try:
|
||||
maximizeWindow = service["maximizeWindow"]
|
||||
except:
|
||||
maximizeWindow = 0
|
||||
if maximizeWindow == 1:
|
||||
self.browser.maximize_window()
|
||||
self.browser.maximize_window() if service.get("maximizeWindow") == 1 else ...
|
||||
# 名称设定
|
||||
if saveName != "": # 命令行覆盖保存名称
|
||||
self.saveName = saveName # 保存文件的名字
|
||||
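The hunks above replace several `try`/`except` blocks with `dict.get` calls that supply a default. A minimal sketch of that pattern, assuming a plain `dict` task description; `resolve_settings` and `cli_save_name` are hypothetical names used only for illustration and do not appear in the repository:

```python
from datetime import datetime


def resolve_settings(service: dict, cli_save_name: str = "") -> dict:
    """Resolve optional task settings, falling back to defaults for absent keys."""
    # Default save name: a timestamp with one-second resolution, as in the diff.
    default_name = datetime.now().strftime("%Y_%m_%d_%H_%M_%S")
    settings = {
        "log": bool(service.get("recordLog", True)),
        "save_name": service.get("saveName", default_name),
        "maximize": service.get("maximizeWindow", 0) == 1,
    }
    if cli_save_name != "":
        # A name passed on the command line overrides the one in the task file.
        settings["save_name"] = cli_save_name
    return settings


print(resolve_settings({"recordLog": 0, "maximizeWindow": 1}))
```

One difference from the removed code is worth keeping in mind: a bare `except:` swallowed every error, whereas `.get` only covers a missing key. A key that is present with the value `None` is returned as `None`, not replaced by the default.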
@ -112,19 +101,23 @@ class BrowserThread(Thread):
|
||||
self.print_and_log("Save Name for task ID", id, "is:", self.saveName)
|
||||
if not os.path.exists("Data/Task_" + str(id)):
|
||||
os.mkdir("Data/Task_" + str(id))
|
||||
if not os.path.exists("Data/Task_" + str(id) + "/" + self.saveName):
|
||||
os.mkdir("Data/Task_" + str(id) + "/" +
|
||||
self.saveName) # 创建保存文件夹用来保存截图
|
||||
self.downloadFolder = "Data/Task_" + str(id) + "/" + self.saveName
|
||||
if not os.path.exists(self.downloadFolder):
|
||||
os.mkdir(self.downloadFolder) # 创建保存文件夹用来保存截图和文件
|
||||
if not os.path.exists(self.downloadFolder + "/files"):
|
||||
os.mkdir(self.downloadFolder + "/files")
|
||||
if not os.path.exists(self.downloadFolder + "/images"):
|
||||
os.mkdir(self.downloadFolder + "/images")
|
||||
self.getDataStep = 0
|
||||
self.startSteps = 0
|
||||
try:
|
||||
startFromExit = service["startFromExit"] # 从上次退出的步骤开始
|
||||
if startFromExit == 1:
|
||||
if service.get("startFromExit", 0) == 1:
|
||||
with open("Data/Task_" + str(self.id) + "/" + self.saveName + '_steps.txt', 'r',
|
||||
encoding='utf-8-sig') as file_obj:
|
||||
self.startSteps = int(file_obj.read()) # 读取已执行步数
|
||||
except:
|
||||
pass
|
||||
except Exception as e:
|
||||
self.print_and_log(f"读取steps.txt失败,原因:{str(e)}")
|
||||
|
||||
if self.startSteps != 0:
|
||||
self.print_and_log("此模式下,任务ID", self.id, "将从上次退出的步骤开始执行,之前已采集条数为",
|
||||
self.startSteps, "条。")
|
||||
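The hunk above creates the per-run download folder with its `files` and `images` subfolders, then reads the saved step count. The same steps can be sketched with `os.makedirs(..., exist_ok=True)`, which creates missing parents in one call. This is an illustrative alternative, not the code in the commit; `prepare_task_folders`, `read_start_steps` and the `base_dir` parameter are hypothetical:

```python
import os


def prepare_task_folders(base_dir: str, task_id, save_name: str) -> str:
    """Create Data/Task_<id>/<save_name>/{files,images} and return the run folder."""
    download_folder = os.path.join(base_dir, "Data", f"Task_{task_id}", save_name)
    for sub in ("files", "images"):
        # exist_ok=True makes the call safe to repeat on later runs.
        os.makedirs(os.path.join(download_folder, sub), exist_ok=True)
    return download_folder


def read_start_steps(base_dir: str, task_id, save_name: str) -> int:
    """Return the step count stored in <save_name>_steps.txt, or 0 if unreadable."""
    steps_path = os.path.join(base_dir, "Data", f"Task_{task_id}", save_name + "_steps.txt")
    try:
        with open(steps_path, "r", encoding="utf-8-sig") as file_obj:
            return int(file_obj.read())
    except (OSError, ValueError):
        # Missing file or non-numeric content: start from the beginning.
        return 0
```

Catching only `OSError` and `ValueError` keeps unrelated failures visible, which is in the spirit of the diff's move from `except: pass` to logging the exception.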
@ -132,7 +125,7 @@ class BrowserThread(Thread):
|
||||
"will start from the last step, before we already collected", self.startSteps, " items.")
|
||||
else:
|
||||
self.print_and_log("此模式下,任务ID", self.id,
|
||||
"将从头F开始执行,如果需要从上次退出的步骤开始执行,请在保存任务时设置是否从上次保存位置开始执行为“是”。")
|
||||
"将从头开始执行,如果需要从上次退出的步骤开始执行,请在保存任务时设置是否从上次保存位置开始执行为“是”。")
|
||||
self.print_and_log("In this mode, task ID", self.id,
|
||||
"will start from the beginning, if you want to start from the last step, please set the option 'start from the last step' to 'yes' when saving the task.")
|
||||
stealth_path = driver_path[:driver_path.find(
|
||||
@ -140,78 +133,83 @@ class BrowserThread(Thread):
|
||||
with open(stealth_path, 'r') as f:
|
||||
js = f.read()
|
||||
self.print_and_log("Loading stealth.min.js")
|
||||
self.browser.execute_cdp_cmd('Page.addScriptToEvaluateOnNewDocument', {
|
||||
'source': js}) # TMALL 反扒
|
||||
self.browser.execute_cdp_cmd('Page.addScriptToEvaluateOnNewDocument', {'source': js}) # TMALL 反扒
|
||||
self.browser.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
|
||||
"source": """
|
||||
Object.defineProperty(navigator, 'webdriver', {
|
||||
get: () => undefined
|
||||
})
|
||||
"""
|
||||
})
|
||||
WebDriverWait(self.browser, 10)
|
||||
self.browser.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
|
||||
path = os.path.join(os.path.abspath("./"), "Data", "Task_" + str(self.id))
|
||||
path = os.path.join(os.path.abspath("./"), "Data", "Task_" + str(self.id), self.saveName, "files")
|
||||
self.paramss = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': path}}
|
||||
|
||||
self.browser.execute("send_command", self.paramss) # 下载地址改变
|
||||
self.browser.execute("send_command", self.paramss) # 下载目录改变
|
||||
self.monitor_event = threading.Event()
|
||||
self.monitor_thread = threading.Thread(target=rename_downloaded_file, args=(path, self.monitor_event)) # the comma after path cannot be omitted: args must be a tuple
|
||||
self.monitor_thread.start()
|
||||
# self.browser.get('about:blank')
|
||||
self.procedure = service["graph"] # 程序执行流程
|
||||
try:
|
||||
self.maxViewLength = service["maxViewLength"] # 最大显示长度
|
||||
except:
|
||||
self.maxViewLength = 15
|
||||
try:
|
||||
self.outputFormat = service["outputFormat"] # 输出格式
|
||||
except:
|
||||
self.outputFormat = "csv"
|
||||
try:
|
||||
self.task_version = service["version"] # 任务版本
|
||||
if service["version"] >= "0.3.1": # 0.3.1及以上版本以上的EasySpider兼容从0.3.1版本开始的所有版本
|
||||
pass
|
||||
else: # 0.3.1以下版本的EasySpider不兼容0.3.1及以上版本的EasySpider
|
||||
if service["version"] != version:
|
||||
self.print_and_log("版本不一致,请使用" +
|
||||
service["version"] + "版本的EasySpider运行该任务!")
|
||||
self.print_and_log("Version not match, please use EasySpider " +
|
||||
service["version"] + " to run this task!")
|
||||
self.browser.quit()
|
||||
sys.exit()
|
||||
except: # 0.2.0版本没有version字段,所以直接退出
|
||||
self.maxViewLength = service.get("maxViewLength", 15) # 最大显示长度
|
||||
self.outputFormat = service.get("outputFormat", "csv") # 输出格式
|
||||
self.save_threshold = service.get("saveThreshold", 10) # 保存最低阈值
|
||||
self.dataWriteMode = service.get("dataWriteMode", DataWriteMode.Append.value) # data write mode: 1 = append, 2 = overwrite, 3 = rename file
|
||||
self.task_version = service.get("version", "") # 任务版本
|
||||
|
||||
if not self.task_version:
|
||||
self.print_and_log("版本不一致,请使用v0.2.0版本的EasySpider运行该任务!")
|
||||
self.print_and_log(
|
||||
"Version not match, please use EasySpider v0.2.0 to run this task!")
|
||||
self.print_and_log("Version not match, please use EasySpider v0.2.0 to run this task!")
|
||||
self.browser.quit()
|
||||
sys.exit()
|
||||
try:
|
||||
self.save_threshold = service["saveThreshold"] # 保存最低阈值
|
||||
except:
|
||||
self.save_threshold = 10
|
||||
try:
|
||||
self.links = list(
|
||||
filter(isnotnull, service["links"].split("\n"))) # 要执行的link的列表
|
||||
except:
|
||||
|
||||
if self.task_version >= "0.3.1": # EasySpider 0.3.1 and later is compatible with every task version from 0.3.1 onward
|
||||
pass
|
||||
elif self.task_version != version: # EasySpider versions below 0.3.1 are not compatible with 0.3.1 and later
|
||||
self.print_and_log(f"版本不一致,请使用{self.task_version}版本的EasySpider运行该任务!")
|
||||
self.print_and_log(f"Version not match, please use EasySpider {self.task_version} to run this task!")
|
||||
self.browser.quit()
|
||||
sys.exit()
|
||||
|
||||
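The version check above compares version strings directly, so the comparison is lexicographic. That is fine while every component is a single digit, but it would misorder a version such as `0.10.0`. A sketch of a numeric comparison, assuming versions are purely dotted integers (whether EasySpider will ever use multi-digit components is not known from this diff; `parse_version` and `is_compatible` are hypothetical helpers):

```python
def parse_version(version: str) -> tuple:
    """Turn '0.3.1' into (0, 3, 1); assumes purely numeric dotted components."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(task_version: str, app_version: str) -> bool:
    """Mirror the branch structure of the check in the diff, compared numerically."""
    if not task_version:
        return False  # tasks without a version field are rejected
    if parse_version(task_version) >= (0, 3, 1):
        return True  # 0.3.1 and later accept every task from 0.3.1 onward
    return task_version == app_version  # older tasks need an exact match


# String comparison goes character by character, so "0.10.0" sorts before "0.3.1".
print("0.10.0" >= "0.3.1", parse_version("0.10.0") >= (0, 3, 1))
```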
service_links = service.get("links")
|
||||
if service_links:
|
||||
self.links = list(filter(isnotnull, service_links.split("\n"))) # 要执行的link的列表
|
||||
else:
|
||||
self.links = list(filter(isnotnull, service["url"])) # 要执行的link
|
||||
|
||||
self.OUTPUT = [] # 采集的数据
|
||||
try:
|
||||
self.dataWriteMode = service["dataWriteMode"] # 数据写入模式,1为追加,2为覆盖
|
||||
except:
|
||||
self.dataWriteMode = 1
|
||||
if self.outputFormat == "csv" or self.outputFormat == "txt" or self.outputFormat == "xlsx" or self.outputFormat == "json":
|
||||
if self.dataWriteMode == 2 and os.path.exists("Data/Task_" + str(self.id) + "/" + self.saveName + '.' + self.outputFormat):
|
||||
os.remove("Data/Task_" + str(self.id) + "/" + self.saveName + '.' + self.outputFormat)
|
||||
self.writeMode = 1 # 写入模式,0为新建,1为追加
|
||||
if self.outputFormat == "csv" or self.outputFormat == "txt" or self.outputFormat == "xlsx":
|
||||
if not os.path.exists("Data/Task_" + str(self.id) + "/" + self.saveName + '.' + self.outputFormat):
|
||||
if self.outputFormat in ["csv", "txt", "xlsx", "json"]:
|
||||
if os.path.exists("Data/Task_" + str(self.id) + "/" + self.saveName + '.' + self.outputFormat):
|
||||
if self.dataWriteMode == DataWriteMode.Cover.value:
|
||||
os.remove("Data/Task_" + str(self.id) + "/" + self.saveName + '.' + self.outputFormat)
|
||||
elif self.dataWriteMode == DataWriteMode.Rename.value:
|
||||
i = 2
|
||||
while os.path.exists("Data/Task_" + str(self.id) + "/" + self.saveName + '_' + str(i) + '.' + self.outputFormat):
|
||||
i = i + 1
|
||||
self.saveName = self.saveName + '_' + str(i)
|
||||
self.print_and_log("文件已存在,已重命名为", self.saveName)
|
||||
self.writeMode = WriteMode.Create.value # 写入模式,0为新建,1为追加
|
||||
if self.outputFormat in ['csv', 'txt', 'xlsx']:
|
||||
if not os.path.exists(f"Data/Task_{str(self.id)}/{self.saveName}.{self.outputFormat}"):
|
||||
self.OUTPUT.append([]) # 添加表头
|
||||
self.writeMode = 0
|
||||
self.writeMode = WriteMode.Create.value
|
||||
elif self.outputFormat == "json":
|
||||
self.writeMode = 3 # JSON模式无需判断是否存在文件
|
||||
self.writeMode = WriteMode.Json.value # JSON模式无需判断是否存在文件
|
||||
elif self.outputFormat == "mysql":
|
||||
self.mysql = myMySQL(config["mysql_config_path"])
|
||||
self.mysql.create_table(self.saveName, service["outputParameters"], remove_if_exists=self.dataWriteMode == 2)
|
||||
self.writeMode = 2
|
||||
if self.writeMode == 0:
|
||||
self.mysql.create_table(self.saveName, service["outputParameters"],
|
||||
remove_if_exists=self.dataWriteMode == DataWriteMode.Cover.value)
|
||||
self.writeMode = WriteMode.MySQL.value # MySQL模式
|
||||
|
||||
if self.writeMode == WriteMode.Create.value:
|
||||
self.print_and_log("新建模式|Create Mode")
|
||||
elif self.writeMode == 1:
|
||||
elif self.writeMode == WriteMode.Append.value:
|
||||
self.print_and_log("追加模式|Append Mode")
|
||||
elif self.writeMode == 2:
|
||||
elif self.writeMode == WriteMode.MySQL.value:
|
||||
self.print_and_log("MySQL模式|MySQL Mode")
|
||||
elif self.writeMode == 3:
|
||||
elif self.writeMode == WriteMode.Json.value:
|
||||
self.print_and_log("JSON模式|JSON Mode")
|
||||
|
||||
self.containJudge = service["containJudge"] # 是否含有判断语句
|
||||
self.outputParameters = {}
|
||||
self.service = service
|
||||
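The hunk above adds a third data write mode that renames the output when a file with the same name already exists. A minimal sketch of that naming logic under stated assumptions: the `DataWriteMode` values are taken from the comment in the diff (1 append, 2 overwrite, 3 rename), since `constants.py` is not shown, and `resolve_output_name` is a hypothetical helper:

```python
import os
from enum import Enum


class DataWriteMode(Enum):
    # Values assumed from the comment in the diff; constants.py is not shown.
    Append = 1
    Cover = 2
    Rename = 3


def resolve_output_name(folder: str, save_name: str, ext: str, mode: int) -> str:
    """Return the save name to use, deleting or sidestepping an existing file."""
    path = os.path.join(folder, f"{save_name}.{ext}")
    if not os.path.exists(path):
        return save_name
    if mode == DataWriteMode.Cover.value:
        os.remove(path)  # overwrite: drop the old file and reuse the name
    elif mode == DataWriteMode.Rename.value:
        i = 2
        # Find the first free suffix: name_2, name_3, ...
        while os.path.exists(os.path.join(folder, f"{save_name}_{i}.{ext}")):
            i += 1
        return f"{save_name}_{i}"
    return save_name  # append keeps writing to the existing file
```

Returning the name instead of mutating shared state makes the three modes easy to test in isolation; the commit itself updates `self.saveName` in place.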
@ -224,191 +222,140 @@ class BrowserThread(Thread):
|
||||
if param["name"] not in self.outputParameters.keys():
|
||||
self.outputParameters[param["name"]] = ""
|
||||
self.dataNotFoundKeys[param["name"]] = False
|
||||
try:
|
||||
self.outputParametersTypes.append(param["type"])
|
||||
except:
|
||||
self.outputParametersTypes.append("text")
|
||||
try:
|
||||
self.outputParametersRecord.append(
|
||||
bool(param["recordASField"]))
|
||||
except:
|
||||
self.outputParametersRecord.append(True)
|
||||
self.outputParametersTypes.append(param.get("type", "text"))
|
||||
self.outputParametersRecord.append(bool(param.get("recordASField", True)))
|
||||
# 文件叠加的时候不添加表头
|
||||
if self.outputFormat == "csv" or self.outputFormat == "txt" or self.outputFormat == "xlsx":
|
||||
if self.writeMode == 0:
|
||||
self.OUTPUT[0].append(param["name"])
|
||||
if self.outputFormat in ["csv", "txt", "xlsx"] and self.writeMode == WriteMode.Create.value:
|
||||
self.OUTPUT[0].append(param["name"])
|
||||
self.urlId = 0 # 全局记录变量
|
||||
self.preprocess() # 预处理,优化提取数据流程
|
||||
try:
|
||||
self.inputExcel = service["inputExcel"] # 输入Excel
|
||||
except:
|
||||
self.inputExcel = ""
|
||||
self.inputExcel = service.get("inputExcel", "") # 输入Excel
|
||||
self.readFromExcel() # 读取Excel获得参数值
|
||||
|
||||
# 检测如果没有复杂的操作,优化提取数据流程
|
||||
def preprocess(self):
|
||||
for node in self.procedure:
|
||||
try:
|
||||
iframe = node["parameters"]["iframe"]
|
||||
except:
|
||||
node["parameters"]["iframe"] = False
|
||||
for index_node, node in enumerate(self.procedure):
|
||||
parameters: dict = node["parameters"]
|
||||
iframe = parameters.get('iframe')
|
||||
option = node["option"]
|
||||
|
||||
try:
|
||||
node["parameters"]["xpath"] = lowercase_tags_in_xpath(
|
||||
node["parameters"]["xpath"])
|
||||
except:
|
||||
pass
|
||||
try:
|
||||
node["parameters"]["waitElementIframeIndex"] = int(
|
||||
node["parameters"]["waitElementIframeIndex"])
|
||||
except:
|
||||
node["parameters"]["waitElement"] = ""
|
||||
node["parameters"]["waitElementTime"] = 10
|
||||
node["parameters"]["waitElementIframeIndex"] = 0
|
||||
if node["option"] == 1: # 打开网页操作
|
||||
try:
|
||||
cookies = node["parameters"]["cookies"]
|
||||
except:
|
||||
node["parameters"]["cookies"] = ""
|
||||
elif node["option"] == 2: # 点击操作
|
||||
try:
|
||||
alertHandleType = node["parameters"]["alertHandleType"]
|
||||
except:
|
||||
node["parameters"]["alertHandleType"] = 0
|
||||
if node["parameters"]["useLoop"]:
|
||||
parameters["iframe"] = False if not iframe else parameters.get('iframe', False)
|
||||
if parameters.get("xpath"):
|
||||
parameters["xpath"] = lowercase_tags_in_xpath(parameters["xpath"])
|
||||
|
||||
if parameters.get("waitElementIframeIndex"):
|
||||
parameters["waitElementIframeIndex"] = int(parameters["waitElementIframeIndex"])
|
||||
else:
|
||||
parameters["waitElement"] = ""
|
||||
parameters["waitElementTime"] = 10
|
||||
parameters["waitElementIframeIndex"] = 0
|
||||
|
||||
if option == GraphOption.Get.value: # 打开网页操作
|
||||
parameters["cookies"] = parameters.get("cookies", "")
|
||||
elif option == GraphOption.Click.value: # 点击操作
|
||||
parameters["alertHandleType"] = parameters.get("alertHandleType", 0)
|
||||
if parameters.get("useLoop"):
|
||||
if self.task_version <= "0.3.5":
|
||||
# 0.3.5及以下版本的EasySpider下的循环点击不支持相对XPath
|
||||
node["parameters"]["xpath"] = ""
|
||||
self.print_and_log("您的任务版本号为" + self.task_version +
|
||||
",循环点击不支持相对XPath写法,已自动切换为纯循环的XPath")
|
||||
elif node["option"] == 3: # 提取数据操作
|
||||
node["parameters"]["recordASField"] = 0
|
||||
try:
|
||||
params = node["parameters"]["params"]
|
||||
except:
|
||||
node["parameters"]["params"] = node["parameters"]["paras"] # 兼容0.5.0及以下版本的EasySpider
|
||||
params = node["parameters"]["params"]
|
||||
try:
|
||||
clear = node["parameters"]["clear"]
|
||||
except:
|
||||
node["parameters"]["clear"] = 0
|
||||
try:
|
||||
newLine = node["parameters"]["newLine"]
|
||||
except:
|
||||
node["parameters"]["newLine"] = 1
|
||||
parameters["xpath"] = ""
|
||||
self.print_and_log(f"您的任务版本号为{self.task_version},循环点击不支持相对XPath写法,已自动切换为纯循环的XPath")
|
||||
elif option == GraphOption.Extract.value: # 提取数据操作
|
||||
parameters["recordASField"] = 0
|
||||
parameters["params"] = parameters.get("params", parameters.get("paras")) # 兼容0.5.0及以下版本的EasySpider
|
||||
parameters["clear"] = parameters.get("clear", 0)
|
||||
parameters["newLine"] = parameters.get("newLine", 1)
|
||||
|
||||
params = parameters["params"]
|
||||
for param in params:
|
||||
try:
|
||||
iframe = param["iframe"]
|
||||
except:
|
||||
param["iframe"] = False
|
||||
try:
|
||||
param["iframe"] = param.get("iframe", False)
|
||||
|
||||
if param.get("relativeXPath"):
|
||||
param["relativeXPath"] = lowercase_tags_in_xpath(param["relativeXPath"])
|
||||
except:
|
||||
pass
|
||||
try:
|
||||
node["parameters"]["recordASField"] = param["recordASField"]
|
||||
except:
|
||||
node["parameters"]["recordASField"] = 1
|
||||
try:
|
||||
splitLine = int(param["splitLine"])
|
||||
except:
|
||||
param["splitLine"] = 0
|
||||
if param["contentType"] == 8:
|
||||
self.print_and_log(
|
||||
"默认的ddddocr识别功能如果觉得不好用,可以自行修改源码get_content函数->contentType == 8的位置换成自己想要的OCR模型然后自己编译运行;或者可以先设置采集内容类型为“元素截图”把图片保存下来,然后用自定义操作调用自己写的程序,程序的功能是读取这个最新生成的图片,然后用好用的模型,如PaddleOCR把图片识别出来,然后把返回值返回给程序作为参数输出。")
|
||||
self.print_and_log(
|
||||
"If you think the default ddddocr function is not good enough, you can modify the source code get_content function -> contentType == 8 position to your own OCR model and then compile and run it; or you can first set the content type of the crawler to \"Element Screenshot\" to save the picture, and then call your own program with custom operations. The function of the program is to read the latest generated picture, then use a good model, such as PaddleOCR to recognize the picture, and then return the return value as a parameter output to the program.")
|
||||
|
||||
parameters["recordASField"] = param.get("recordASField", 1)
|
||||
|
||||
param["splitLine"] = 0 if not param.get("splitLine") else param.get("splitLine")
|
||||
|
||||
if param.get("contentType") == 8:
|
||||
self.print_and_log("默认的ddddocr识别功能如果觉得不好用,可以自行修改源码get_content函数->contentType =="
|
||||
"8的位置换成自己想要的OCR模型然后自己编译运行;或者可以先设置采集内容类型为“元素截图”把图片"
|
||||
"保存下来,然后用自定义操作调用自己写的程序,程序的功能是读取这个最新生成的图片,然后用好用"
|
||||
"的模型,如PaddleOCR把图片识别出来,然后把返回值返回给程序作为参数输出。")
|
||||
self.print_and_log("If you think the default ddddocr function is not good enough, you can "
|
||||
"modify the source code get_content function -> contentType == 8 position "
|
||||
"to your own OCR model and then compile and run it; or you can first set "
|
||||
"the content type of the crawler to \"Element Screenshot\" to save the "
|
||||
"picture, and then call your own program with custom operations. The "
|
||||
"function of the program is to read the latest generated picture, then use "
|
||||
"a good model, such as PaddleOCR to recognize the picture, and then return "
|
||||
"the return value as a parameter output to the program.")
|
||||
param["optimizable"] = detect_optimizable(param)
|
||||
elif node["option"] == 4: # 输入文字
|
||||
try:
|
||||
index = node["parameters"]["index"] # 索引值
|
||||
except:
|
||||
node["parameters"]["index"] = 0
|
||||
elif node["option"] == 5: # 自定义操作
|
||||
try:
|
||||
clear = node["parameters"]["clear"]
|
||||
except:
|
||||
node["parameters"]["clear"] = 0
|
||||
try:
|
||||
newLine = node["parameters"]["newLine"]
|
||||
except:
|
||||
node["parameters"]["newLine"] = 1
|
||||
elif node["option"] == 7: # 移动到元素
|
||||
if node["parameters"]["useLoop"]:
|
||||
if self.task_version <= "0.3.5":
|
||||
# 0.3.5及以下版本的EasySpider下的循环点击不支持相对XPath
|
||||
node["parameters"]["xpath"] = ""
|
||||
self.print_and_log("您的任务版本号为" + self.task_version +
|
||||
",循环点击不支持相对XPath写法,已自动切换为纯循环的XPath")
|
||||
elif node["option"] == 8: # 循环操作
|
||||
try:
|
||||
exitElement = node["parameters"]["exitElement"]
|
||||
if exitElement == "":
|
||||
node["parameters"]["exitElement"] = "//body"
|
||||
except:
|
||||
node["parameters"]["exitElement"] = "//body"
|
||||
node["parameters"]["quickExtractable"] = False # 是否可以快速提取
|
||||
try:
|
||||
skipCount = node["parameters"]["skipCount"]
|
||||
except:
|
||||
node["parameters"]["skipCount"] = 0
|
||||
elif option == GraphOption.Input.value: # 输入文字
|
||||
parameters['index'] = parameters.get('index', 0)
|
||||
elif option == GraphOption.Custom.value: # 自定义操作
|
||||
parameters['clear'] = parameters.get('clear', 0)
|
||||
parameters['newLine'] = parameters.get('newLine', 1)
|
||||
elif option == GraphOption.Move.value: # 移动到元素
|
||||
if parameters.get('useLoop'):
|
||||
if self.task_version <= "0.3.5": # 0.3.5及以下版本的EasySpider下的循环点击不支持相对XPath
|
||||
parameters["xpath"] = ""
|
||||
self.print_and_log(f"您的任务版本号为{self.task_version},循环点击不支持相对XPath写法,已自动切换为纯循环的XPath")
|
||||
elif option == GraphOption.Loop.value: # 循环操作
|
||||
parameters['exitElement'] = "//body" if not parameters.get('exitElement') or parameters.get('exitElement') == "" else parameters.get('exitElement')
|
||||
parameters["quickExtractable"] = False # 是否可以快速提取
|
||||
parameters['skipCount'] = parameters.get('skipCount', 0)
|
||||
|
||||
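# If a fixed or non-fixed element list loop contains exactly one extract-data operation, and that operation extracts element screenshots, the data can be extracted quickly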
|
||||
if len(node["sequence"]) == 1 and self.procedure[node["sequence"][0]]["option"] == 3 and (int(node["parameters"]["loopType"]) == 1 or int(node["parameters"]["loopType"]) == 2):
|
||||
try:
|
||||
params = self.procedure[node["sequence"][0]]["parameters"]["params"]
|
||||
except:
|
||||
params = self.procedure[node["sequence"][0]]["parameters"]["paras"] # 兼容0.5.0及以下版本的EasySpider
|
||||
try:
|
||||
waitElement = self.procedure[node["sequence"][0]]["parameters"]["waitElement"]
|
||||
except:
|
||||
waitElement = ""
|
||||
if node["parameters"]["iframe"]:
|
||||
node["parameters"]["quickExtractable"] = False # 如果是iframe,那么不可以快速提取
|
||||
if len(node["sequence"]) == 1 and self.procedure[node["sequence"][0]]["option"] == 3 \
|
||||
and (int(node["parameters"]["loopType"]) == 1 or int(node["parameters"]["loopType"]) == 2):
|
||||
params = self.procedure[node["sequence"][0]].get("parameters").get("params")
|
||||
if not params:
|
||||
params = self.procedure[node["sequence"][0]]["parameters"]["paras"] # 兼容0.5.0及以下版本的EasySpider
|
||||
|
||||
waitElement = self.procedure[node["sequence"][0]]["parameters"].get("waitElement", "")
|
||||
|
||||
if parameters["iframe"]:
|
||||
parameters["quickExtractable"] = False # 如果是iframe,那么不可以快速提取
|
||||
else:
|
||||
node["parameters"]["quickExtractable"] = True # 先假设可以快速提取
|
||||
if node["parameters"]["skipCount"] > 0:
|
||||
node["parameters"]["quickExtractable"] = False # 如果有跳过的元素,那么不可以快速提取
|
||||
parameters["quickExtractable"] = True # 先假设可以快速提取
|
||||
|
||||
if parameters["skipCount"] > 0:
|
||||
parameters["quickExtractable"] = False # 如果有跳过的元素,那么不可以快速提取
|
||||
|
||||
for param in params:
|
||||
optimizable = detect_optimizable(param, ignoreWaitElement=False, waitElement=waitElement)
|
||||
try:
|
||||
iframe = param["iframe"]
|
||||
except:
|
||||
param["iframe"] = False
|
||||
if param["iframe"] and not param["relative"]: # 如果是iframe,那么不可以快速提取
|
||||
param['iframe'] = param.get('iframe', False)
|
||||
if param["iframe"] and not param["relative"]: # 如果是iframe,那么不可以快速提取
|
||||
optimizable = False
|
||||
if not optimizable: # 如果有一个不满足优化条件,那么就不能快速提取
|
||||
node["parameters"]["quickExtractable"] = False
|
||||
if not optimizable: # 如果有一个不满足优化条件,那么就不能快速提取
|
||||
parameters["quickExtractable"] = False
|
||||
break
|
||||
if node["parameters"]["quickExtractable"]:
|
||||
self.print_and_log("循环操作<" + node["title"] + ">可以快速提取数据")
|
||||
self.print_and_log("Loop operation <" + node["title"] + "> can extract data quickly")
|
||||
try:
|
||||
node["parameters"]["clear"] = self.procedure[node["sequence"][0]]["parameters"]["clear"]
|
||||
except:
|
||||
node["parameters"]["clear"] = 0
|
||||
try:
|
||||
node["parameters"]["newLine"] = self.procedure[node["sequence"][0]]["parameters"]["newLine"]
|
||||
except:
|
||||
node["parameters"]["newLine"] = 1
|
||||
if int(node["parameters"]["loopType"]) == 1: # 不固定元素列表
|
||||
|
||||
if parameters["quickExtractable"]:
|
||||
self.print_and_log(f"循环操作<{node['title']}>可以快速提取数据")
|
||||
self.print_and_log(f"Loop operation <{node['title']}> can extract data quickly")
|
||||
parameters["clear"] = self.procedure[node["sequence"][0]]["parameters"].get("clear", 0)
|
||||
parameters["newLine"] = self.procedure[node["sequence"][0]]["parameters"].get("newLine", 1)
|
||||
|
||||
if int(node["parameters"]["loopType"]) == 1: # 不固定元素列表
|
||||
node["parameters"]["baseXPath"] = node["parameters"]["xpath"]
|
||||
elif int(node["parameters"]["loopType"]) == 2: # 固定元素列表
|
||||
elif int(node["parameters"]["loopType"]) == 2: # 固定元素列表
|
||||
node["parameters"]["baseXPath"] = node["parameters"]["pathList"]
|
||||
node["parameters"]["quickParams"] = []
|
||||
for param in params:
|
||||
content_type = ""
|
||||
if param["relativeXPath"].find("/@href") >= 0 or param["relativeXPath"].find("/text()") >= 0 or param["relativeXPath"].find(
|
||||
"::text()") >= 0:
|
||||
if param["relativeXPath"].find("/@href") >= 0 or param["relativeXPath"].find("/text()") >= 0 \
|
||||
or param["relativeXPath"].find("::text()") >= 0:
|
||||
content_type = ""
|
||||
elif param["nodeType"] == 2:
|
||||
content_type = "//@href"
|
||||
elif param["nodeType"] == 4: # 图片链接
|
||||
elif param["nodeType"] == 4: # 图片链接
|
||||
content_type = "//@src"
|
||||
elif param["contentType"] == 1:
|
||||
content_type = "/text()"
|
||||
elif param["contentType"] == 0:
|
||||
content_type = "//text()"
|
||||
if param["relative"]: # 如果是相对XPath
|
||||
if param["relative"]: # 如果是相对XPath
|
||||
xpath = "." + param["relativeXPath"] + content_type
|
||||
else:
|
||||
xpath = param["relativeXPath"] + content_type
|
||||
@ -422,6 +369,7 @@ class BrowserThread(Thread):
|
||||
"nodeType": param["nodeType"],
|
||||
"default": param["default"],
|
||||
})
|
||||
self.procedure[index_node]["parameters"] = parameters
|
||||
self.print_and_log("预处理完成|Preprocess completed")
|
||||
|
||||
def readFromExcel(self):
|
||||
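The quick-extraction preprocessing in the hunk above chooses an XPath suffix per field. A minimal standalone sketch of that rule follows. The function name `build_quick_xpath` is hypothetical, and reading `nodeType == 2` as a link node and `contentType` 1/0 as direct text versus all descendant text is an assumption inferred from the suffixes, not something the diff states.

```python
def build_quick_xpath(param):
    """Return the XPath used to quick-extract one field (illustrative helper)."""
    relative_xpath = param["relativeXPath"]
    suffix = ""
    # An XPath that already selects an attribute or text node needs no suffix.
    if any(token in relative_xpath for token in ("/@href", "/text()", "::text()")):
        suffix = ""
    elif param["nodeType"] == 2:      # assumed: link node
        suffix = "//@href"
    elif param["nodeType"] == 4:      # image link, per the comment in the diff
        suffix = "//@src"
    elif param["contentType"] == 1:   # assumed: text of the node itself
        suffix = "/text()"
    elif param["contentType"] == 0:   # assumed: text of the node and its descendants
        suffix = "//text()"
    # Relative XPaths are anchored on the current loop element with a leading dot.
    prefix = "." if param["relative"] else ""
    return prefix + relative_xpath + suffix
```

Keeping the rule in one pure function like this would make it testable without a browser; the diff itself keeps the logic inline.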
@ -521,7 +469,7 @@ class BrowserThread(Thread):
|
||||
"/", len(self.links))
|
||||
self.executeNode(0)
|
||||
self.urlId = self.urlId + 1
|
||||
files = os.listdir("Data/Task_" + str(self.id) + "/" + self.saveName)
|
||||
# files = os.listdir("Data/Task_" + str(self.id) + "/" + self.saveName)
|
||||
# 如果目录为空,则删除该目录
|
||||
# if not files:
|
||||
# os.rmdir("Data/Task_" + str(self.id) + "/" + self.saveName)
|
||||
@ -538,12 +486,16 @@ class BrowserThread(Thread):
|
||||
self.print_and_log(f"任务执行完毕,将在{quitWaitTime}秒后自动退出浏览器并清理临时用户目录,等待时间可在保存任务对话框中设置。")
|
||||
self.print_and_log(f"The task is completed, the browser will exit automatically and the temporary user directory will be cleaned up after {quitWaitTime} seconds, the waiting time can be set in the save task dialog.")
|
||||
time.sleep(quitWaitTime)
|
||||
self.browser.quit()
|
||||
try:
|
||||
self.browser.quit()
|
||||
except:
|
||||
pass
|
||||
self.print_and_log("正在清理临时用户目录……|Cleaning up temporary user directory...")
|
||||
try:
|
||||
shutil.rmtree(self.option["tmp_user_data_folder"])
|
||||
except:
|
||||
pass
|
||||
self.monitor_event.set()
|
||||
self.print_and_log("清理完成!|Clean up completed!")
|
||||
self.print_and_log("您现在可以安全的关闭此窗口了。|You can safely close this window now.")
|
||||
|
||||
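The hunk above wraps `self.browser.quit()` in a `try` so that cleanup of the temporary user directory still runs when the browser is already gone. A rough sketch of that shutdown order, under the assumption that `browser` is any object with a `quit()` method; the helper name `shutdown` and the `log` parameter are hypothetical:

```python
import shutil


def shutdown(browser, tmp_user_data_folder, log=print):
    """Quit the browser if possible, then always remove the temporary profile."""
    try:
        browser.quit()
    except Exception:
        # The window may have been closed by hand; cleanup should still proceed.
        pass
    log("Cleaning up temporary user directory...")
    # ignore_errors=True mirrors the bare try/except around rmtree in the diff.
    shutil.rmtree(tmp_user_data_folder, ignore_errors=True)
    log("Clean up completed!")
```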
@ -753,28 +705,32 @@ class BrowserThread(Thread):
|
||||
self.browser.set_script_timeout(max_wait_time)
|
||||
try:
|
||||
output = self.browser.execute_script(code)
|
||||
except:
|
||||
except Exception as e:
|
||||
output = ""
|
||||
self.recordLog("JavaScript execution failed")
|
||||
self.print_and_log("执行下面的代码时出错:" + code, ",错误为:", str(e))
|
||||
self.print_and_log("Error executing the following code:" + code, ", error is:", str(e))
|
||||
elif int(codeMode) == 2:
|
||||
self.recordLog("Execute JavaScript for element:" + code)
|
||||
self.recordLog("对元素执行JavaScript:" + code)
|
||||
self.browser.set_script_timeout(max_wait_time)
|
||||
try:
|
||||
output = self.browser.execute_script(code, element)
|
||||
except:
|
||||
except Exception as e:
|
||||
output = ""
|
||||
self.recordLog("JavaScript execution failed")
|
||||
self.print_and_log("执行下面的代码时出错:" + code, ",错误为:", str(e))
|
||||
self.print_and_log("Error executing the following code:" + code, ", error is:", str(e))
|
||||
elif int(codeMode) == 5:
|
||||
try:
|
||||
code = readCode(code)
|
||||
# global_namespace = globals().copy()
|
||||
# global_namespace["self"] = self
|
||||
output = exec(code)
|
||||
self.recordLog("执行下面的代码:" + code)
|
||||
self.recordLog("Execute the following code:" + code)
|
||||
except Exception as e:
|
||||
self.print_and_log("执行下面的代码时出错:" + code, ",错误为:", e)
|
||||
self.print_and_log("执行下面的代码时出错:" + code, ",错误为:", str(e))
|
||||
self.print_and_log("Error executing the following code:" +
|
||||
code, ", error is:", e)
|
||||
code, ", error is:", str(e))
|
||||
elif int(codeMode) == 6:
|
||||
try:
|
||||
code = readCode(code)
|
||||
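The change above replaces bare `except:` clauses around `execute_script` with `except Exception as e:` so the failing code and the error text are logged. A minimal sketch of that pattern, assuming a Selenium-style object exposing `execute_script(code, *args)`; `run_script` and `_StubBrowser` are hypothetical names used only for illustration:

```python
def run_script(browser, code, element=None, log=print):
    """Run JavaScript, returning "" and logging the error if execution fails."""
    try:
        if element is None:
            return browser.execute_script(code)
        return browser.execute_script(code, element)
    except Exception as e:
        log("Error executing the following code:" + code + ", error is: " + str(e))
        return ""


class _StubBrowser:
    """Stand-in for a WebDriver: fails on code containing 'boom'."""

    def execute_script(self, code, *args):
        if "boom" in code:
            raise ValueError("script error")
        return "ok"


messages = []
result_ok = run_script(_StubBrowser(), "return 1;", log=messages.append)
result_failed = run_script(_StubBrowser(), "boom();", log=messages.append)
```

Converting the exception with `str(e)` before logging, as the diff does, avoids passing an exception object to a logger that concatenates strings.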
@ -847,6 +803,23 @@ class BrowserThread(Thread):
|
||||
self.print_and_log("根据设置的自定义操作,任务已刷新页面|Task refreshed page according to custom operation")
|
||||
elif codeMode == 9: # 发送邮件
|
||||
send_email(node["parameters"]["emailConfig"])
|
||||
elif codeMode == 10: # 清空所有字段值
|
||||
self.clearOutputParameters()
|
||||
elif codeMode == 11: # 生成新的数据行
|
||||
line = new_line(self.outputParameters,
|
||||
self.maxViewLength, self.outputParametersRecord)
|
||||
self.OUTPUT.append(line)
|
||||
elif codeMode == 12: # 退出程序
|
||||
self.print_and_log("根据设置的自定义操作,任务已退出|Task exited according to custom operation")
|
||||
self.saveData(exit=True)
|
||||
self.browser.quit()
|
||||
self.print_and_log("正在清理临时用户目录……|Cleaning up temporary user directory...")
|
||||
try:
|
||||
shutil.rmtree(self.option["tmp_user_data_folder"])
|
||||
except:
|
||||
pass
|
||||
self.print_and_log("清理完成!|Clean up completed!")
|
||||
os._exit(0)
|
||||
else: # 0 1 5 6
|
||||
output = self.execute_code(
|
||||
codeMode, code, max_wait_time, iframe=params["iframe"])
|
||||
@ -1106,7 +1079,25 @@ class BrowserThread(Thread):
|
||||
self.recordLog(
|
||||
"判断条件内所有条件分支的条件都不满足|None of the conditions in the judgment condition are met")
|
||||
|
||||
def handleHistory(self, node, xpath, thisHistoryURL, thisHistoryLength, index, element=None, elements=None):
|
||||
def handleHistory(self, node, xpath, thisHandle, thisHistoryURL, thisHistoryLength, index, element=None, elements=None):
|
||||
        try:
            changed_handle = self.browser.current_window_handle != thisHandle
        except:  # In case the page was closed unexpectedly
            self.browser.switch_to.window(
                self.browser.window_handles[-1])
            changed_handle = self.browser.window_handles[-1] != thisHandle
        if changed_handle:  # If the active tab changed after one loop iteration
            try:
                while True:  # Keep closing tabs until we are back at the original tab
                    self.browser.close()  # Close the tab that is no longer needed
                    self.browser.switch_to.window(
                        self.browser.window_handles[-1])
                    if self.browser.current_window_handle == thisHandle:
                        break
            except Exception as e:
                self.print_and_log("关闭标签页发生错误:", e)
                self.print_and_log(
                    "Error occurred while closing tab: ", e)
        if self.history["index"] != thisHistoryLength and self.history["handle"] == self.browser.current_window_handle:  # If the history changed after one loop iteration; note the check that we are still on the current page
            difference = thisHistoryLength - self.history["index"]  # Compute the history offset
            self.browser.execute_script('history.go(' + str(difference) + ')')  # Go back in history
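The tab-closing loop that this commit moves into `handleHistory` can be sketched in isolation. The version below is an illustration, not the project's code: `close_tabs_until` and the `max_closes` guard are hypothetical additions, and `_FakeBrowser` only imitates the three WebDriver members the loop touches (`current_window_handle`, `window_handles`, `switch_to.window`).

```python
def close_tabs_until(browser, target_handle, max_closes=50):
    """Close the active tab repeatedly until target_handle is active again."""
    closed = 0
    while browser.current_window_handle != target_handle and closed < max_closes:
        browser.close()                                        # drop the used tab
        browser.switch_to.window(browser.window_handles[-1])   # focus the last tab
        closed += 1
    return closed


class _FakeSwitch:
    def __init__(self, browser):
        self._browser = browser

    def window(self, handle):
        self._browser.current_window_handle = handle


class _FakeBrowser:
    """Minimal stand-in for a WebDriver with several open tabs."""

    def __init__(self, handles):
        self.window_handles = list(handles)
        self.current_window_handle = self.window_handles[-1]
        self.switch_to = _FakeSwitch(self)

    def close(self):
        self.window_handles.remove(self.current_window_handle)


fake = _FakeBrowser(["main", "tab1", "tab2"])
closed_count = close_tabs_until(fake, "main")
```

The bound on the number of closes is a design choice for the sketch: if the original tab no longer exists, an unbounded loop would only stop when the driver raises.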
@ -1132,12 +1123,13 @@ class BrowserThread(Thread):
|
||||
if self.browser.current_url == thisHistoryURL or ti > thisHistoryLength: # 如果执行完一次循环之后网址发生了变化
|
||||
break
|
||||
time.sleep(2)
|
||||
if element == None: # 不固定元素列表
|
||||
element = self.browser.find_elements(By.XPATH, xpath, iframe=node["parameters"]["iframe"])
|
||||
else: # 固定元素列表
|
||||
element = self.browser.find_element(By.XPATH, xpath, iframe=node["parameters"]["iframe"])
|
||||
# if index > 0:
|
||||
# index -= 1 # 如果是data:开头的网址,就要重试一次
|
||||
if xpath != "":
|
||||
if element == None: # 不固定元素列表
|
||||
element = self.browser.find_elements(By.XPATH, xpath, iframe=node["parameters"]["iframe"])
|
||||
else: # 固定元素列表
|
||||
element = self.browser.find_element(By.XPATH, xpath, iframe=node["parameters"]["iframe"])
|
||||
# if index > 0:
|
||||
# index -= 1 # 如果是data:开头的网址,就要重试一次
|
||||
else:
|
||||
if element == None:
|
||||
element = elements
|
||||
@ -1156,6 +1148,14 @@ class BrowserThread(Thread):
|
||||
self.history["handle"] = thisHandle
|
||||
thisHistoryURL = self.browser.current_url
|
||||
# 快速提取处理
|
||||
# start = time.time()
|
||||
try:
|
||||
tree = html.fromstring(self.browser.page_source)
|
||||
except Exception as e:
|
||||
self.print_and_log("解析页面时出错,将切换普通提取模式|Error parsing page, will switch to normal extraction mode")
|
||||
node["parameters"]["quickExtractable"] = False
|
||||
# end = time.time()
|
||||
# print("解析页面秒数:", end - start)
|
||||
if node["parameters"]["quickExtractable"]:
|
||||
self.browser.switch_to.default_content() # 切换到主页面
|
||||
tree = html.fromstring(self.browser.page_source)
|
||||
@ -1321,25 +1321,7 @@ class BrowserThread(Thread):
|
||||
if self.BREAK:
|
||||
self.BREAK = False
|
||||
break
|
||||
try:
|
||||
changed_handle = self.browser.current_window_handle != thisHandle
|
||||
except: # 如果网页被意外关闭了的情况下
|
||||
self.browser.switch_to.window(
|
||||
self.browser.window_handles[-1])
|
||||
changed_handle = self.browser.window_handles[-1] != thisHandle
|
||||
if changed_handle: # 如果执行完一次循环之后标签页的位置发生了变化
|
||||
try:
|
||||
while True: # 一直关闭窗口直到当前标签页
|
||||
self.browser.close() # 关闭使用完的标签页
|
||||
self.browser.switch_to.window(
|
||||
self.browser.window_handles[-1])
|
||||
if self.browser.current_window_handle == thisHandle:
|
||||
break
|
||||
except Exception as e:
|
||||
self.print_and_log("关闭标签页发生错误:", e)
|
||||
self.print_and_log(
|
||||
"Error occurred while closing tab: ", e)
|
||||
index, elements = self.handleHistory(node, xpath, thisHistoryURL, thisHistoryLength, index, elements=elements)
|
||||
index, elements = self.handleHistory(node, xpath, thisHandle, thisHistoryURL, thisHistoryLength, index, elements=elements)
|
||||
if int(node["parameters"]["breakMode"]) > 0: # 如果设置了退出循环的脚本条件
|
||||
output = self.execute_code(int(
|
||||
node["parameters"]["breakMode"]) - 1, node["parameters"]["breakCode"],
|
||||
@ -1381,25 +1363,7 @@ class BrowserThread(Thread):
|
||||
if self.BREAK:
|
||||
self.BREAK = False
|
||||
break
|
||||
try:
|
||||
changed_handle = self.browser.current_window_handle != thisHandle
|
||||
except: # 如果网页被意外关闭了的情况下
|
||||
self.browser.switch_to.window(
|
||||
self.browser.window_handles[-1])
|
||||
changed_handle = self.browser.window_handles[-1] != thisHandle
|
||||
if changed_handle: # 如果执行完一次循环之后标签页的位置发生了变化
|
||||
try:
|
||||
while True: # 一直关闭窗口直到当前标签页
|
||||
self.browser.close() # 关闭使用完的标签页
|
||||
self.browser.switch_to.window(
|
||||
self.browser.window_handles[-1])
|
||||
if self.browser.current_window_handle == thisHandle:
|
||||
break
|
||||
except Exception as e:
|
||||
self.print_and_log("关闭标签页发生错误:", e)
|
||||
self.print_and_log(
|
||||
"Error occurred while closing tab: ", e)
|
||||
index, element = self.handleHistory(node, path, thisHistoryURL, thisHistoryLength, index, element=element)
|
||||
index, element = self.handleHistory(node, path, thisHandle, thisHistoryURL, thisHistoryLength, index, element=element)
|
||||
except NoSuchElementException:
|
||||
self.print_and_log("Loop element not found: ", path)
|
||||
self.print_and_log("找不到循环元素:", path)
|
||||
@ -1447,6 +1411,7 @@ class BrowserThread(Thread):
|
||||
code = get_output_code(output)
|
||||
if code <= 0:
|
||||
break
|
||||
index, _ = self.handleHistory(node, "", thisHandle, thisHistoryURL, thisHistoryLength, index)
|
||||
elif int(node["parameters"]["loopType"]) == 4: # 固定网址列表
|
||||
# tempList = node["parameters"]["textList"].split("\r\n")
|
||||
urlList = list(
|
||||
@ -1696,8 +1661,11 @@ class BrowserThread(Thread):
|
||||
try:
|
||||
actions = ActionChains(self.browser) # 实例化一个action对象
|
||||
if newTab == 1: # 在新标签页打开
|
||||
# Ctrl + Click
|
||||
actions.key_down(Keys.CONTROL).click(element).key_up(Keys.CONTROL).perform()
|
||||
if sys.platform == "darwin": # Mac
|
||||
actions.key_down(Keys.COMMAND).click(element).key_up(Keys.COMMAND).perform()
|
||||
else:
|
||||
# Ctrl + Click
|
||||
actions.key_down(Keys.CONTROL).click(element).key_up(Keys.CONTROL).perform()
|
||||
else:
|
||||
actions.click(element).perform()
|
||||
except Exception as e:
|
||||
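The hunk above switches the new-tab click from Ctrl to Command on macOS. A small sketch of that platform choice, assuming the result is later resolved against Selenium's `Keys` class (which defines both `COMMAND` and `CONTROL`); the helper name `new_tab_modifier` is hypothetical:

```python
import sys


def new_tab_modifier(platform=None):
    """Return the name of the modifier key that opens a link in a new tab."""
    platform = sys.platform if platform is None else platform
    # macOS browsers use Command+Click; other platforms use Ctrl+Click.
    return "COMMAND" if platform == "darwin" else "CONTROL"
```

With Selenium installed, the key itself could then be looked up as `getattr(Keys, new_tab_modifier())` and passed to `key_down` / `key_up` as in the diff.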
@ -1715,6 +1683,21 @@ class BrowserThread(Thread):
|
||||
script = 'var result = document.evaluate(`' + path + \
|
||||
'`, document, null, XPathResult.ANY_TYPE, null);for(let i=0;i<arguments[0];i++){result.iterateNext();} result.iterateNext().click();'
|
||||
self.browser.execute_script(script, str(index)) # 用js的点击方法
|
||||
elif click_way == 2: # 双击
|
||||
try:
|
||||
actions = ActionChains(self.browser) # 实例化一个action对象
|
||||
actions.double_click(element).perform()
|
||||
except Exception as e:
|
||||
self.browser.execute_script("arguments[0].scrollIntoView();", element)
|
||||
try:
|
||||
actions = ActionChains(self.browser) # 实例化一个action对象
|
||||
actions.double_click(element).perform()
|
||||
except Exception as e:
|
||||
self.print_and_log(f"Selenium双击元素{path}失败,将尝试使用JavaScript双击")
|
||||
self.print_and_log(f"Failed to double click element {path} with Selenium, will try to double click with JavaScript")
|
||||
script = 'var result = document.evaluate(`' + path + \
|
||||
'`, document, null, XPathResult.ANY_TYPE, null);for(let i=0;i<arguments[0];i++){result.iterateNext();} result.iterateNext().click();'
|
||||
self.browser.execute_script(script, str(index)) # 用js的点击方法
|
||||
self.recordLog("点击元素|Click element: " + path)
|
||||
except TimeoutException:
|
||||
self.print_and_log(
|
||||
@ -1797,7 +1780,6 @@ class BrowserThread(Thread):
|
||||
self.print_and_log("History Length Error")
|
||||
self.history["index"] = 0
|
||||
self.scrollDown(param) # 根据参数配置向下滚动
|
||||
# rt.end()
|
||||
|
||||
def get_content(self, p, element):
|
||||
content = ""
|
||||
@ -1824,7 +1806,7 @@ class BrowserThread(Thread):
|
||||
downloadPic = 0
|
||||
if downloadPic == 1:
|
||||
download_image(self, content, "Data/Task_" +
|
||||
str(self.id) + "/" + self.saveName + "/", element)
|
||||
str(self.id) + "/" + self.saveName + "/images", element)
|
||||
else: # 普通节点
|
||||
if p["splitLine"] == 1:
|
||||
text = extract_text_from_html(element.get_attribute('outerHTML'))
|
||||
@ -1853,7 +1835,7 @@ class BrowserThread(Thread):
|
||||
downloadPic = 0
|
||||
if downloadPic == 1:
|
||||
download_image(self, content, "Data/Task_" +
|
||||
str(self.id) + "/" + self.saveName + "/", element)
|
||||
str(self.id) + "/" + self.saveName + "/images", element)
|
||||
else:
|
||||
command = 'var arr = [];\
|
||||
var content = arguments[0];\
|
||||
@ -1965,6 +1947,8 @@ class BrowserThread(Thread):
|
||||
content = element.get_attribute(attribute_name)
|
||||
except:
|
||||
content = ""
|
||||
elif p["contentType"] == 15: # 常量值
|
||||
content = p["JS"]
|
||||
if content == None:
|
||||
content = ""
|
||||
return content
|
||||
@ -2208,7 +2192,9 @@ if __name__ == '__main__':
|
||||
"server_address": "http://localhost:8074",
|
||||
"keyboard": True, # 是否监听键盘输入
|
||||
"pause_key": "p", # 暂停键
|
||||
"version": "0.6.0",
|
||||
"version": "0.6.3",
|
||||
"docker_driver": "",
|
||||
"user_folder": "",
|
||||
}
|
||||
c = Config(config)
|
||||
print(c)
|
||||
@ -2283,7 +2269,9 @@ if __name__ == '__main__':
|
||||
|
||||
options.add_argument(
|
||||
"--disable-blink-features=AutomationControlled") # TMALL 反扒
|
||||
|
||||
# 阻止http -> https的重定向
|
||||
options.add_argument("--disable-features=CrossSiteDocumentBlockingIfIsolating,CrossSiteDocumentBlockingAlways,IsolateOrigins,site-per-process")
|
||||
options.add_argument("--disable-web-security") # 禁用同源策略
|
||||
options.add_argument('-ignore-certificate-errors')
|
||||
options.add_argument('-ignore -ssl-errors')
|
||||
|
||||
@ -2302,35 +2290,43 @@ if __name__ == '__main__':
|
||||
os.mkdir(tmp_user_folder_parent)
|
||||
characters = string.ascii_letters + string.digits
|
||||
for i in range(len(c.ids)):
|
||||
id = c.ids[i]
|
||||
# 从字符集中随机选择字符构成字符串
|
||||
random_string = ''.join(random.choice(characters) for i in range(10))
|
||||
tmp_user_data_folder = os.path.join(tmp_user_folder_parent, "user_data_" + str(id) + "_" + str(time.time()).replace(".","") + "_" + random_string)
|
||||
tmp_options[i]["tmp_user_data_folder"] = tmp_user_data_folder
|
||||
if os.path.exists(tmp_user_data_folder):
|
||||
try:
|
||||
shutil.rmtree(tmp_user_data_folder)
|
||||
except:
|
||||
pass
|
||||
print(f"Copying user data folder to: {tmp_user_data_folder}, please wait...")
|
||||
print(f"正在复制用户信息目录到: {tmp_user_data_folder},请稍等...")
|
||||
if os.path.exists(absolute_user_data_folder):
|
||||
try:
|
||||
shutil.copytree(absolute_user_data_folder, tmp_user_data_folder)
|
||||
print("User data folder copied successfully, if you exit the program before it finishes, please delete the temporary user data folder manually.")
|
||||
print("用户信息目录复制成功,如果程序在运行过程中被手动退出,请手动删除临时用户信息目录。")
|
||||
except:
|
||||
tmp_user_data_folder = absolute_user_data_folder
|
||||
print("Copy user data folder failed, use the original folder.")
|
||||
print("复制用户信息目录失败,使用原始目录。")
|
||||
else:
|
||||
tmp_user_data_folder = absolute_user_data_folder
|
||||
print("Cannot find user data folder, create a new folder.")
|
||||
print("未找到用户信息目录,创建新目录。")
|
||||
options = tmp_options[i]["options"]
|
||||
options.add_argument(
|
||||
f'--user-data-dir={tmp_user_data_folder}') # TMALL 反扒
|
||||
options.add_argument("--profile-directory=Default")
|
||||
if c.user_folder == "":
|
||||
id = c.ids[i]
|
||||
# 从字符集中随机选择字符构成字符串
|
||||
random_string = ''.join(random.choice(characters) for i in range(10))
|
||||
tmp_user_data_folder = os.path.join(tmp_user_folder_parent, "user_data_" + str(id) + "_" + str(time.time()).replace(".","") + "_" + random_string)
|
||||
tmp_options[i]["tmp_user_data_folder"] = tmp_user_data_folder
|
||||
if os.path.exists(tmp_user_data_folder):
|
||||
try:
|
||||
shutil.rmtree(tmp_user_data_folder)
|
||||
except:
|
||||
pass
|
||||
print(f"Copying user data folder to: {tmp_user_data_folder}, please wait...")
|
||||
print(f"正在复制用户信息目录到: {tmp_user_data_folder},请稍等...")
|
||||
if os.path.exists(absolute_user_data_folder):
|
||||
try:
|
||||
shutil.copytree(absolute_user_data_folder, tmp_user_data_folder)
|
||||
print("User data folder copied successfully, if you exit the program before it finishes, please delete the temporary user data folder manually.")
|
||||
print("用户信息目录复制成功,如果程序在运行过程中被手动退出,请手动删除临时用户信息目录。")
|
||||
except:
|
||||
tmp_user_data_folder = absolute_user_data_folder
|
||||
print("Copy user data folder failed, use the original folder.")
|
||||
print("复制用户信息目录失败,使用原始目录。")
|
||||
else:
|
||||
tmp_user_data_folder = absolute_user_data_folder
|
||||
print("Cannot find user data folder, create a new folder.")
|
||||
print("未找到用户信息目录,创建新目录。")
|
||||
options.add_argument(
|
||||
f'--user-data-dir={tmp_user_data_folder}') # TMALL 反扒
|
||||
print(f"Use local user data folder: {tmp_user_data_folder}")
|
||||
print(f"使用本地用户信息目录: {tmp_user_data_folder}")
|
||||
else:
|
||||
options.add_argument(
|
||||
f'--user-data-dir={c.user_folder}')
|
||||
print(f"Use specified user data folder: {c.user_folder}", ", please note if you are using docker, this user folder path should be the path inside the docker container.")
|
||||
print(f"使用指定的用户信息目录: {c.user_folder}", ",请注意如果您正在使用docker,此用户文件夹路径应是容器内的路径。")
|
||||
print(
|
||||
"如果报错Selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally,说明有之前运行的Chrome实例没有正常关闭,请关闭之前打开的所有Chrome实例后再运行程序即可。")
|
||||
print(
|
||||
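The temporary profile path in the hunk above combines the task id, a timestamp with the dot removed, and ten random alphanumeric characters. A minimal sketch of the same naming scheme; the function name `make_tmp_user_data_folder` and the parent folder used in the example call are assumptions for illustration:

```python
import os
import random
import string
import time


def make_tmp_user_data_folder(parent, task_id):
    """Build a collision-resistant path for a per-task temporary browser profile."""
    characters = string.ascii_letters + string.digits
    random_string = "".join(random.choice(characters) for _ in range(10))
    timestamp = str(time.time()).replace(".", "")
    return os.path.join(parent, f"user_data_{task_id}_{timestamp}_{random_string}")


# "TempUserDataFolder" is a placeholder parent directory for this example.
example_path = make_tmp_user_data_folder("TempUserDataFolder", 7)
```

The random suffix matters when several tasks start within the same clock tick, since the timestamp alone would then not be unique.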
@ -2343,9 +2339,13 @@ if __name__ == '__main__':
|
||||
print("id: ", id)
|
||||
if c.read_type == "remote":
|
||||
print("remote")
|
||||
content = requests.get(
|
||||
try:
|
||||
content = requests.get(
|
||||
c.server_address + "/queryExecutionInstance?id=" + str(id))
|
||||
service = json.loads(content.text) # 加载服务信息
|
||||
service = json.loads(content.text) # 加载服务信息
|
||||
except:
|
||||
print("Cannot connect to the server, please make sure that the EasySpider Main Program is running, or you can change the --read_type parameter to 'local' to read the task information from the local task file without keeping the EasySpider Main Program running.")
|
||||
print("无法连接到服务器,请确保EasySpider主程序正在运行,或者您可以将--read_type参数更改为'local',以实现从本地任务文件中读取任务信息而无需保持EasySpider主程序运行。")
|
||||
else:
|
||||
print("local")
|
||||
local_folder = os.path.join(os.getcwd(), "execution_instances")
|
||||
@ -2370,8 +2370,8 @@ if __name__ == '__main__':
|
||||
cloudflare = 0
|
||||
if cloudflare == 0:
|
||||
options.add_argument('log-level=3') # 隐藏日志
|
||||
path = os.path.join(os.path.abspath("./"), "Data", "Task_" + str(id))
|
||||
print("Data path:", path)
|
||||
path = os.path.join(os.path.abspath("./"), "Data", "Task_" + str(id), "files")
|
||||
print("文件下载路径|File Download path:", path)
|
||||
options.add_experimental_option("prefs", {
|
||||
# 设置文件下载路径
|
||||
"download.default_directory": path,
|
||||
@ -2396,8 +2396,17 @@ if __name__ == '__main__':
|
||||
except:
|
||||
browser = "chrome"
|
||||
if browser == "chrome":
|
||||
selenium_service = Service(executable_path=driver_path)
|
||||
browser_t = MyChrome(service=selenium_service, options=options)
|
||||
if c.docker_driver == "":
|
||||
print("Using local driver")
|
||||
selenium_service = Service(executable_path=driver_path)
|
||||
browser_t = MyChrome(service=selenium_service, options=options, mode='local_driver')
|
||||
else:
|
||||
print("Using remote driver")
|
||||
# Use docker driver, default address is http://localhost:4444/wd/hub
|
||||
# Headless mode
|
||||
# options.add_argument("--headless")
|
||||
# print("Headless mode")
|
||||
browser_t = MyChrome(command_executor=c.docker_driver, options=options, mode='remote_driver')
|
||||
elif browser == "edge":
|
||||
from selenium.webdriver.edge.service import Service as EdgeService
|
||||
from selenium.webdriver.edge.options import Options as EdgeOptions
|
||||
@ -2458,6 +2467,7 @@ if __name__ == '__main__':
|
||||
# print("Passing the Cloudflare verification mode is sometimes unstable. If the verification fails, you need to try again every few minutes, or you can change to a new user information folder and then execute the task.")
|
||||
# 使用监听器监听键盘输入
|
||||
try:
|
||||
from pynput.keyboard import Key, Listener
|
||||
if c.keyboard:
|
||||
with Listener(on_press=on_press_creator(press_time, event),
|
||||
on_release=on_release_creator(event, press_time)) as listener:
|
||||
|
@ -1 +1,50 @@
#!/bin/bash

# Use lsb_release to get system information
os_name=$(lsb_release -si)
os_version=$(lsb_release -sr)

# Extract the major and minor version numbers
major_version=$(echo "$os_version" | cut -d'.' -f1)
minor_version=$(echo "$os_version" | cut -d'.' -f2)

# Check whether the system is Ubuntu 24.04 or later
# (24.04 is the first 24.x release, so comparing the major version is sufficient)
if [ "$os_name" == "Ubuntu" ] && [ "$major_version" -ge 24 ]; then
    # Path of the file to check
    file_path="./EasySpider/chrome-sandbox"

    # Check whether the file exists
    if [ ! -e "$file_path" ]; then
        echo "File Not Exist!"
        exit 1
    fi

    # Get the owner of the file
    owner=$(stat -c %U "$file_path")

    # Get the permissions of the file
    permissions=$(stat -c %a "$file_path")

    # Check whether the owner is root and the permissions are 4755
    if [ "$owner" != "root" ] || [ "$permissions" != "4755" ]; then
        echo "这是你第一次在该Ubuntu系统上使用EasySpider,请在下方输入密码来调整文件权限以使用EasySpider:"
        echo "This is the first time you are using EasySpider on this Ubuntu system. Please enter your password below (root/sudo permission required) to adjust the file permissions:"
        sudo chown root:root "$file_path"
        sudo chmod 4755 "$file_path"
        sudo chown root:root "./EasySpider/resources/app/chrome_linux64/chrome-sandbox"
        sudo chmod 4755 "./EasySpider/resources/app/chrome_linux64/chrome-sandbox"
    fi
else
    echo "如果报错“The SUID sandbox helper binary was found, but is not configured correctly”,请尝试执行以下命令后再次运行EasySpider:"
    echo "If you encounter the error message “The SUID sandbox helper binary was found, but is not configured correctly”, please try running the following commands and then run EasySpider again:"
    echo ""
    echo "sudo chown root:root ./EasySpider/chrome-sandbox"
    echo "sudo chmod 4755 ./EasySpider/chrome-sandbox"
    echo "sudo chown root:root ./EasySpider/resources/app/chrome_linux64/chrome-sandbox"
    echo "sudo chmod 4755 ./EasySpider/resources/app/chrome_linux64/chrome-sandbox"
    echo ""
    echo ""
fi


./EasySpider/EasySpider
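The launcher script's intent, per its comment, is to adjust the sandbox permissions only on Ubuntu 24.04 or later. The same decision can be sketched in Python for clarity. This is an illustration in a different language than the script, and the function name `needs_sandbox_fix` is hypothetical:

```python
def needs_sandbox_fix(os_name, os_version):
    """Return True when the distribution is Ubuntu 24.04 or later."""
    if os_name != "Ubuntu":
        return False
    parts = os_version.split(".")
    try:
        major = int(parts[0])
        minor = int(parts[1]) if len(parts) > 1 else 0
    except ValueError:
        # Non-numeric versions (for example "rolling") never match.
        return False
    # Tuple comparison handles the major and minor parts together.
    return (major, minor) >= (24, 4)
```

Checking the distribution name first means the numeric comparison is never attempted for other systems, which avoids errors on version strings that are not numbers.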
|
@ -23,7 +23,7 @@ For more complex operations, please download the source code and compile it for
|
||||
"""
|
||||
|
||||
# 请在下面编写你的代码,不要有代码缩进!!! | Please write your code below, do not indent the code!!!
|
||||
|
||||
print(globals())
|
||||
# 导包 | Import packages
|
||||
from selenium.common.exceptions import ElementClickInterceptedException
|
||||
|
||||
@ -56,3 +56,20 @@ finally:
|
||||
print("All parameters:", self.outputParameters)
|
||||
print(test(3))
|
||||
print("执行完毕|Execution completed")
|
||||
|
||||
import time
|
||||
time.sleep(3)
|
||||
|
||||
def new_line(outputParameters, maxViewLength, record):
|
||||
line = []
|
||||
print("Use this function to print a new line in the console")
|
||||
i = 0
|
||||
for value in outputParameters.values():
|
||||
line.append(value)
|
||||
if record[i]:
|
||||
print(value[:maxViewLength], " ", end="")
|
||||
i += 1
|
||||
print("")
|
||||
return line
|
||||
|
||||
new_line(self.outputParameters, 10, [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True])
|
@ -4,82 +4,14 @@ Then EasySpider will be opened, and don't close the terminal when running EasySp
|
||||
|
||||
Official Site: https://www.easyspider.net
|
||||
|
||||
Welcome to promote this software to other friends.
|
||||
You are welcome to recommend this software to friends and to star our GitHub repository!
|
||||
|
||||
This version is for Ubuntu 20.04, Debian, Deepin x64 and above.
|
||||
|
||||
The software's open-source code repository on GitHub: https://github.com/NaiboWang/EasySpider
|
||||
|
||||
Official documentation can be found at: https://github.com/NaiboWang/EasySpider/wiki
|
||||
|
||||
Video Tutorial: https://youtube.com/playlist?list=PL0kEFEkWrT7mt9MUlEBV2DTo1QsaanUTp
|
||||
|
||||
Tasks can be imported from other machines by simply placing the .json files from the "tasks" folder of those machines into the "tasks" folder of this directory. Similarly, execution instance files can be imported by copying the .json files from the "execution_instances" folder. Note that only files named with a number greater than 0 are supported in both folders.
|
||||
|
||||
|
||||
======Version Update Instruction======
|
||||
|
||||
For new features in versions later than v0.3.2, see the GitHub releases page: https://github.com/NaiboWang/EasySpider/releases
|
||||
|
||||
-----v0.3.2-----
|
||||
|
||||
## Update Instruction
|
||||
|
||||
1. Selected child element operations can delete fields and unmark deleted fields in real-time in the browser.
|
||||
2. Selecting child elements adds a selection mode that allows you to choose only the child elements that are present in all blocks or the child elements that are the same as the first selected block.
|
||||
3. In the text input and webpage open options, you can use the extracted field value as a variable for text input, represented by Field["field_name"].
|
||||
4. Files can be downloaded, such as PDF files.
|
||||
5. Fixed a bug where the software could display a blank screen for about 10 seconds after opening, making it usable in intranets, darknets, and any local network.
|
||||
6. Fixed a bug where the current page URL and title could not be extracted.
|
||||
7. Fixed a bug where OCR recognition could fail to extract information.
|
||||
8. Updated extraction logic to save locally every 10 records collected.
|
||||
9. When modifying a task, the default anchor position is set to after the last operation in the task flow.
|
||||
10. Updated Chrome version to 114.
|
||||
|
||||
-----v0.3.1-----
|
||||
|
||||
1. Advanced Operations:
|
||||
|
||||
- Custom scripts can be executed in the workflow, including executing JavaScript commands in the browser and invoking scripts at the operating system level. The command's return value can be obtained and recorded, greatly expanding the scope of operations.
|
||||
- Before and after each operation, you can specify a JavaScript command to be executed targeting the current located element.
|
||||
|
||||
2. Custom scripts are also supported in the conditions and loop conditions. The return value of the custom script determines the condition for the judgment of conditions and loops, greatly enhancing the flexibility of tasks. The ability to use the break statement within a loop is added, allowing custom operations to manipulate elements within the loop.
|
||||
|
||||
|
||||
3. Multiple XPath expressions are generated simultaneously for user selection, and the XPath Helper extension is pre-installed for XPath debugging.
|
||||
|
||||
4. Added the functionality to extract the background image URL of elements, current page title, and current page URL.
|
||||
|
||||
5. Added the capability to save screenshots of elements or entire web pages. This feature works best in headless mode.
|
||||
|
||||
6. Added the functionality to download images.
|
||||
|
||||
7. Added OCR recognition of elements. To use this feature, Tesseract library needs to be installed first: https://tesseract-ocr.github.io/tessdoc/Installation.html
|
||||
|
||||
8. Directly extract the return value of executing JavaScript code on elements, allowing for functionalities such as regular expression matching and obtaining the background color of elements.
|
||||
|
||||
9. Added the capability to switch dropdown options and extract the selected value and text of dropdown options.
|
||||
|
||||
10. Significantly improved user guidance and explanations to make the software more user-friendly. This includes instructions on handling iframe tags, explanations of parameter meanings for various options, and explanations on modifying the XPath for loop items, and more.
|
||||
|
||||
11. Added instructions on how to execute tasks from the command line.
|
||||
|
||||
12. Added headless mode configuration, allowing the software to run without a browser interface.
|
||||
|
||||
13. Fixed the issue where Chinese paths couldn't be recognized correctly when using user-configured browser modes.
|
||||
|
||||
14. Fixed the issue where the program would freeze when there was no unconditional branch in the conditional branching.
|
||||
|
||||
15. Fixed the issue where the input box would freeze after saving a task.
|
||||
|
||||
16. Added the option to set the maximum waiting time for page load in the "Open Page" and "Click element" operations.
|
||||
|
||||
17. Added the functionality to move the mouse to an element.
|
||||
|
||||
18. Displays a prompt when an element cannot be found.
|
||||
|
||||
19. Fixed the webpage scrolling bug.
|
||||
|
||||
20. The task name is initialized with the value of the page title upon the first visit.
|
||||
|
||||
21. Added version update prompts.
|
||||
|
||||
22. Added the information of the publisher as requested.
|
||||
|
||||
23. Updated Chrome version to 113.
|
||||
|
File diff suppressed because one or more lines are too long
@ -1 +1 @@
|
||||
{"id":228,"name":"[2312.02977] Exploring the nonclassical dynamics of the \"classical'' Schrödinger equation","url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","create_time":"12/7/2023, 2:44:58 AM","update_time":"12/7/2023, 2:56:47 AM","version":"0.6.0","saveThreshold":10,"quitWaitTime":60,"environment":1,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"desc":"https://arxiv.org/abs/2312.02977","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://arxiv.org/abs/2312.02977","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://arxiv.org/abs/2312.02977"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, 
\"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}}]}
|
||||
{"id":228,"name":"[2312.02977] Exploring the nonclassical dynamics of the \"classical'' Schrödinger equation","url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","create_time":"12/7/2023, 2:44:58 AM","update_time":"2024-01-05 22:08:46","version":"0.6.0","saveThreshold":10,"quitWaitTime":3,"environment":1,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"TTT","dataWriteMode":3,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://arxiv.org/abs/2312.02977","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://arxiv.org/abs/2312.02977","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://arxiv.org/abs/2312.02977"},{"id":1,"name":"loopTimes_1","nodeId":5,"nodeName":"循环 - 单个元素","desc":"循环循环 - 单个元素执行的次数(0代表无限循环)","type":"int","exampleValue":10,"value":10}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,5],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":3,"index":2,"parentId":2,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, 
\"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":-1,"index":3,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":2,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, \"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":-1,"index":4,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":3,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, \"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button 
download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":2,"index":5,"parentId":0,"type":1,"option":8,"title":"循环 - 单个元素","sequence":[2],"isInLoop":false,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"//body","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":0,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":10,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0}}]}
|
@ -1 +1 @@
|
||||
{"id":229,"name":"知乎 - 有问题,就会有答案","url":"https://www.zhihu.com","links":"https://www.zhihu.com","create_time":"07/12/2023, 03:26:24","update_time":"07/12/2023, 03:43:34","version":"0.6.0","saveThreshold":10,"quitWaitTime":6,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"pauseKey":"t","containJudge":false,"desc":"https://www.zhihu.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.zhihu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.zhihu.com"}],"outputParameters":[{"id":0,"name":"参数1_文本","desc":"","type":"text","recordASField":1,"exampleValue":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.zhihu.com","links":"https://www.zhihu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","wai
tElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":["/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","//div[contains(., '死刑执行前可以谎称肚')]","/html/body/div[last()-7]/div/main/div/div/div[last()-1]/div/div/div/div/div/div[last()-12]/div/div/div/div/h2/div"]}},{"id":3,"index":3,"parentId":2,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":0,"contentType":0,"relative":true,"name":"参数1_文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"unique_index":"onlvi030w9jlpu5tjzb","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0}],"loopType":1}}]}
|
||||
{"id":229,"name":"知乎 - 有问题,就会有答案","url":"https://www.zhihu.com","links":"https://www.zhihu.com","create_time":"07/12/2023, 03:26:24","update_time":"2023-12-27 20:05:50","version":"0.6.0","saveThreshold":10,"quitWaitTime":6,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"t","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"知了个乎","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.zhihu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.zhihu.com"},{"id":1,"name":"loopTimes_1","nodeId":4,"nodeName":"循环 - 单个元素","desc":"循环循环 - 单个元素执行的次数(0代表无限循环)","type":"int","exampleValue":0,"value":0}],"outputParameters":[{"id":0,"name":"参数1_文本","desc":"","type":"text","recordASField":1,"exampleValue":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,4,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.zhihu.com","links":"https://www.zhihu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":3,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":2,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]
/div[1]/div/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":["/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","//div[contains(., '死刑执行前可以谎称肚')]","/html/body/div[last()-7]/div/main/div/div/div[last()-1]/div/div/div/div/div/div[last()-12]/div/div/div/div/h2/div"]}},{"id":4,"index":3,"parentId":3,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":0,"contentType":0,"relative":true,"name":"参数1_文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"unique_index":"onlvi030w9jlpu5tjzb","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0}],"loopType":1}},{"id":2,"index":4,"parentId":0,"type":1,"option":8,"title":"循环 - 
单个元素","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":0,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0}}]}
|
@ -1 +1 @@
|
||||
{"id":70,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"5/24/2023, 8:21:45 PM","version":"0.3.1","containJudge":false,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"string","exampleValue":"https://www.jd.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","wait":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":""}},{"id":3,"index":3,"parentId":2,"type":0,"option":7,"title":"移动到元素","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":2,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"allXPaths":"","loopType":1}}]}
|
||||
{"id":-2,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"5/24/2023, 8:21:45 PM","version":"0.3.1","containJudge":false,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"string","exampleValue":"https://www.jd.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","wait":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":""}},{"id":3,"index":3,"parentId":2,"type":0,"option":7,"title":"移动到元素","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":2,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"allXPaths":"","loopType":1}}]}
|
File diff suppressed because one or more lines are too long
@ -1,4 +1,4 @@
|
||||
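You are welcome to share this software with more friends who need it!
|
||||
You are welcome to share this software with more friends who need it and to star our GitHub repository!
|
||||
|
||||
Open a Linux terminal in this folder and enter the following command to run the software: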
|
||||
./easy-spider.sh
|
||||
@ -8,99 +8,10 @@
|
||||
|
||||
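Supports Ubuntu 20.04, Debian, and Deepin x64 and later versions.
|
||||
|
||||
Open-source code repository on GitHub: https://github.com/NaiboWang/EasySpider
|
||||
|
||||
Official documentation: https://github.com/NaiboWang/EasySpider/wiki
|
||||
|
||||
Video tutorial: https://www.bilibili.com/video/BV1th411A7ey/
|
||||
|
||||
Tasks can be imported from other machines: simply place the .json files from the other machine's tasks folder into the tasks folder in this directory. Likewise, execution-ID files can be imported by copying the .json files from the execution_instances folder. Note that .json files in both folders may only be named with a number greater than 0.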
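The import rule above (copy `.json` files whose names are numbers greater than 0) can be sketched in Python. This is a minimal illustration, not part of EasySpider: the helper names `is_valid_task_name` and `import_tasks` are hypothetical, and the sketch assumes that an existing file with the same name may simply be overwritten.

```python
import shutil
from pathlib import Path


def is_valid_task_name(path: Path) -> bool:
    """Return True for a .json file whose name is an integer greater than 0."""
    stem = path.stem
    return (
        path.suffix == ".json"
        and stem.isascii()
        and stem.isdigit()
        and int(stem) > 0
    )


def import_tasks(src_dir: Path, dst_dir: Path) -> list[str]:
    """Copy validly named task files from src_dir to dst_dir.

    Hypothetical helper: files already present in dst_dir under the
    same name are overwritten (an assumption of this sketch).
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src_dir.glob("*.json")):
        if is_valid_task_name(path):
            shutil.copy2(path, dst_dir / path.name)
            copied.append(path.name)
    return copied
```

Calling `import_tasks(Path("other_machine/tasks"), Path("tasks"))` would copy only the files that satisfy the naming rule; the same function could be pointed at the `execution_instances` folder. Both paths here are placeholders.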
|
||||
|
||||
|
||||
======Release Notes======
|
||||
|
||||
For release notes of versions above v0.3.2, see the GitHub Releases page: https://github.com/NaiboWang/EasySpider/releases
|
||||
|
||||
-----v0.3.2-----
|
||||
|
||||
## Release Notes
|
||||
|
||||
1. The select-child-elements operation can now delete fields, and deleted fields are unmarked in the browser in real time.
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/e016c832-6ff9-4814-b86c-38787e73aa30" width=50% />
|
||||
|
||||
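2. Selecting child elements now has selection modes: you can select only the child elements that all blocks share, or the child elements in all blocks that match those of the first selected block.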
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/0082b11d-96bc-43f1-acdb-8280decb48b4" width=50% />
|
||||
|
||||
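3. The input-text and open-page options can use the most recently extracted field value **as a variable** for text input; write this variable as `Field["字段名"]`, where `字段名` is the field name.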
|
||||

|
||||
|
||||
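4. Files such as PDFs can be downloaded.
|
||||
5. Fixed a bug where the window could stay blank for about 10 seconds after opening, so the software can now be used on intranets, the dark web, and any local area network.
|
||||
6. Fixed a bug where extracting the current page's URL and title could fail.
|
||||
7. Fixed a bug where OCR recognition could fail to extract content.
|
||||
8. Extraction logic now saves locally after every 10 collected records.
|
||||
9. When modifying a task, the default anchor position is now after the last operation in the task flow.
|
||||
10. Updated Chrome to version 114.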
|
||||
|
||||
-----V0.3.1-----
|
||||
|
||||
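If the download is slow, consider the download address within mainland China: [download address in mainland China](https://github.com/NaiboWang/EasySpider/releases/download/v0.3.0/Download_Link_Address_in_China_Mainland.txt).
|
||||
|
||||
### We strongly recommend watching the new-feature walkthrough videos
|
||||
|
||||
Videos covering the latest version's features have been uploaded to Bilibili; the new videos are very useful and recommended viewing.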
|
||||
|
||||
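[Important: custom condition checks using the return value of a JS command inside a loop item - Part 2](https://www.bilibili.com/video/BV1mu411x7Nn/)
|
||||
|
||||
[How to run multiple tasks at the same time (parallel instances)](https://www.bilibili.com/video/BV13c411G7LE/)
|
||||
|
||||
[How to run your own JS code and system code (custom operations)](https://www.bilibili.com/video/BV1qs4y1z7Hc/)
|
||||
|
||||
[How to customize loops and conditions - Part 1](https://www.bilibili.com/video/BV1Ys4y1z777/)
|
||||
|
||||
[How to take screenshots of elements and web pages, and a guide to (headless mode) command-line execution](https://www.bilibili.com/video/BV1dV4y1z764/)
|
||||
|
||||
[OCR recognition of element content](https://www.bilibili.com/video/BV1xz4y1b72D/)
|
||||
|
||||
Note: the `.json` files in the v0.3.1 task folder are incompatible with all previous versions; please redesign your tasks for v0.3.1.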
|
||||
|
||||
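## Release Notes
|
||||
1. Custom operations:
|
||||
- You can **run custom scripts** within the task flow, including **executing JavaScript in the browser** and **calling operating-system-level scripts**, and can **capture and record the command's return value**, greatly expanding what a task can do.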
|
||||
|
||||

|
||||
|
||||
- Before and after every operation, you can specify a JavaScript snippet to run against the currently located element.
|
||||
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/dde64388-5668-40ff-951e-fb8f60655c49" height=50% width=50%>
|
||||
|
||||
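2. **Conditions and loop conditions** can likewise **run custom scripts**, using whether the script's return value is truthy as the condition for branching and looping, which also greatly increases task flexibility. Loops gain a setting for breaking out via code, and custom operations can act on elements inside a loop.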
|
||||

|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/5ce7cf50-e5c9-4714-a83b-9c65934e9c68" width=50%></img>
|
||||
|
||||
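3. Multiple XPaths can be generated at once for the user to choose from, and the **XPath Helper extension is preinstalled** for debugging XPaths.
|
||||
4. Added collection of an element's background image URL, the current page title, and the current page URL.
|
||||
5. Added saving of element screenshots; use this to capture an element or the entire web page (works better in headless mode).
|
||||
6. Added image downloading.
|
||||
7. Added OCR recognition of elements (to use this feature, first install the Tesseract library yourself: [https://blog.csdn.net/u010454030/article/details/80515501](https://blog.csdn.net/u010454030/article/details/80515501)).
|
||||
|
||||
8. The return value of JavaScript executed on an element can be extracted directly, enabling features such as regular-expression matching and reading an element's background color.
|
||||
9. Added switching of dropdown options, and collection of the currently selected dropdown value and text.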
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/c0b2bec1-2a97-4516-930e-1b310697212b" width=50%></img>
|
||||
|
||||

|
||||
|
||||
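10. Greatly expanded usage hints and explanations to make the software easier to use (for example, how iframe tags are handled, what each option's parameters mean, and how to modify a loop item's XPath).
|
||||
11. When executing a task, a hint on how to run tasks from the command line is now shown: [https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction](https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction).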
|
||||

|
||||
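12. Added parallel multi-instance mode.
|
||||
13. Added headless mode, i.e. a configuration that runs without a browser window.
|
||||
14. Fixed Chinese paths not being recognized correctly in user-profile browser mode.
|
||||
15. Fixed a hang when a conditional had no unconditional branch.
|
||||
16. Fixed input boxes freezing after a task was saved.
|
||||
17. The open-page and click-element operations now have a setting for the maximum page-load wait time.
|
||||
18. Added moving the mouse to an element.
|
||||
19. A message is shown when an element cannot be found.
|
||||
20. Fixed a page-scrolling bug.
|
||||
21. Added an operation for adding new extract-data fields.
|
||||
22. The task name is initialized to the title of the first page entered.
|
||||
23. Added a version update notification.
|
||||
24. Added producer information, as requested.
|
||||
25. Updated Chrome to version 113.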
|
||||
|
||||
|
@ -1,8 +1,29 @@
|
||||
Because of MacOS's complex security settings, the software may fail to open on the first attempt with an "unverified developer" message. Please refer to the following GitHub document to see how to open the software and perform tasks on your MacOS version:
|
||||
Because of MacOS's complex security settings, the first time you open the downloaded software MacOS warns that the developer is unverified and does not allow the application to run. Please follow these steps to unlock it:
|
||||
|
||||
https://github.com/NaiboWang/EasySpider/wiki/MacOS-Guide
|
||||
1. Open the system Terminal.
|
||||
|
||||
The main steps are as follows:
|
||||
2. Navigate to the EasySpider software directory, such as:
|
||||
|
||||
cd ~/Downloads/EasySpider_MacOS
|
||||
|
||||
3. In the EasySpider directory, run the `first_time_run.sh` script to modify the package properties by using the following command:
|
||||
|
||||
bash first_time_run.sh
|
||||
|
||||
|
||||
|
||||
This will unlock EasySpider for both design and execution stages.
|
||||
|
||||
If you encounter errors such as the one below during the command execution, they can be ignored, and you may proceed to open the software after the command completes:
|
||||
|
||||
xattr: [Errno 13] Permission denied: 'EasySpider.app/Contents/Resources/app/node_modules/node-window-manager/build/node_gyp_bins/python3'
|
||||
Alternatively, see this video on how to open the software and run tasks on MacOS: https://www.bilibili.com/video/BV1E34y137fT/
|
||||
|
||||
- Design phase - Apple Arm chip version of MacOS
|
||||
|
||||
|
@ -4,6 +4,6 @@ There is a potential issue with the software for MacOS, in that the Chrome softw
|
||||
|
||||
To check the Chrome version, right-click the EasySpider application and choose "Show Package Contents". Then open the Contents/Resources/app folder and double-click chrome_mac64 to launch Chrome. In Chrome, go to Settings -> About and check whether its version matches the version reported by chromedriver_mac64 when you open that file manually.
|
||||
|
||||
If they do not match, download the macOS Chromedriver that corresponds to your current Chrome version from https://chromedriver.chromium.org/downloads. Place the downloaded Chromedriver in the Contents/Resources/app folder mentioned above and rename it to replace the "chromedriver_mac64" file, which restores normal use of the software.
|
||||
If they do not match, download the macOS Chromedriver that corresponds to your current Chrome version (only the major version number before the first decimal point needs to match, such as 122) from https://chromedriver.chromium.org/downloads. Place the downloaded Chromedriver in the Contents/Resources/app folder mentioned above and rename it to replace the "chromedriver_mac64" file, which restores normal use of the software.
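The version check described above compares only the number before the first decimal point. A minimal Python sketch of that comparison follows; the function names and the sample version strings are hypothetical and only illustrate the rule, they are not taken from EasySpider.

```python
def major_version(version: str) -> int:
    """Return the number before the first decimal point of a version string."""
    return int(version.strip().split(".")[0])


def versions_match(chrome_version: str, driver_version: str) -> bool:
    """Compare Chrome and Chromedriver by major version only."""
    return major_version(chrome_version) == major_version(driver_version)


if __name__ == "__main__":
    # Hypothetical version strings, used only for illustration.
    print(versions_match("122.0.6261.94", "122.0.6261.57"))
    print(versions_match("122.0.6261.94", "121.0.6167.85"))
```

If `versions_match` returns `False` for the versions you read from Chrome and from Chromedriver, the Chromedriver file needs to be replaced as described.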
|
||||
|
||||
|
||||
|
File diff suppressed because one or more lines are too long
@ -1 +1 @@
|
||||
{"id":228,"name":"[2312.02977] Exploring the nonclassical dynamics of the \"classical'' Schrödinger equation","url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","create_time":"12/7/2023, 2:44:58 AM","update_time":"12/7/2023, 2:56:47 AM","version":"0.6.0","saveThreshold":10,"quitWaitTime":60,"environment":1,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"desc":"https://arxiv.org/abs/2312.02977","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://arxiv.org/abs/2312.02977","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://arxiv.org/abs/2312.02977"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, 
\"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}}]}
|
||||
{"id":228,"name":"[2312.02977] Exploring the nonclassical dynamics of the \"classical'' Schrödinger equation","url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","create_time":"12/7/2023, 2:44:58 AM","update_time":"2024-01-05 22:08:46","version":"0.6.0","saveThreshold":10,"quitWaitTime":3,"environment":1,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"TTT","dataWriteMode":3,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://arxiv.org/abs/2312.02977","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://arxiv.org/abs/2312.02977","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://arxiv.org/abs/2312.02977"},{"id":1,"name":"loopTimes_1","nodeId":5,"nodeName":"循环 - 单个元素","desc":"循环循环 - 单个元素执行的次数(0代表无限循环)","type":"int","exampleValue":10,"value":10}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,5],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":3,"index":2,"parentId":2,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, 
\"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":-1,"index":3,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":2,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, \"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":-1,"index":4,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":3,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, \"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button 
download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":2,"index":5,"parentId":0,"type":1,"option":8,"title":"循环 - 单个元素","sequence":[2],"isInLoop":false,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"//body","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":0,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":10,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0}}]}
|
@ -1 +1 @@
|
||||
{"id":229,"name":"知乎 - 有问题,就会有答案","url":"https://www.zhihu.com","links":"https://www.zhihu.com","create_time":"07/12/2023, 03:26:24","update_time":"07/12/2023, 03:43:34","version":"0.6.0","saveThreshold":10,"quitWaitTime":6,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"pauseKey":"t","containJudge":false,"desc":"https://www.zhihu.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.zhihu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.zhihu.com"}],"outputParameters":[{"id":0,"name":"参数1_文本","desc":"","type":"text","recordASField":1,"exampleValue":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.zhihu.com","links":"https://www.zhihu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","wai
tElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":["/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","//div[contains(., '死刑执行前可以谎称肚')]","/html/body/div[last()-7]/div/main/div/div/div[last()-1]/div/div/div/div/div/div[last()-12]/div/div/div/div/h2/div"]}},{"id":3,"index":3,"parentId":2,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":0,"contentType":0,"relative":true,"name":"参数1_文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"unique_index":"onlvi030w9jlpu5tjzb","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0}],"loopType":1}}]}
|
||||
{"id":229,"name":"知乎 - 有问题,就会有答案","url":"https://www.zhihu.com","links":"https://www.zhihu.com","create_time":"07/12/2023, 03:26:24","update_time":"2023-12-27 20:05:50","version":"0.6.0","saveThreshold":10,"quitWaitTime":6,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"t","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"知了个乎","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.zhihu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.zhihu.com"},{"id":1,"name":"loopTimes_1","nodeId":4,"nodeName":"循环 - 单个元素","desc":"循环循环 - 单个元素执行的次数(0代表无限循环)","type":"int","exampleValue":0,"value":0}],"outputParameters":[{"id":0,"name":"参数1_文本","desc":"","type":"text","recordASField":1,"exampleValue":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,4,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.zhihu.com","links":"https://www.zhihu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":3,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":2,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]
/div[1]/div/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":["/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","//div[contains(., '死刑执行前可以谎称肚')]","/html/body/div[last()-7]/div/main/div/div/div[last()-1]/div/div/div/div/div/div[last()-12]/div/div/div/div/h2/div"]}},{"id":4,"index":3,"parentId":3,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":0,"contentType":0,"relative":true,"name":"参数1_文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"unique_index":"onlvi030w9jlpu5tjzb","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0}],"loopType":1}},{"id":2,"index":4,"parentId":0,"type":1,"option":8,"title":"循环 - 
单个元素","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":0,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0}}]}
File diff suppressed because one or more lines are too long
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/309.json
Normal file
@@ -0,0 +1 @@
{"id":309,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"2023-12-24 00:34:50","update_time":"2023-12-24 00:36:58","version":"0.6.0","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":1,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.jd.com"},{"id":1,"name":"inputText_1","nodeName":"输入文字","nodeId":2,"desc":"要输入的文本,如京东搜索框输入:电脑","type":"text","exampleValue":"JS(\"return new Date().getYear()\")1","value":"JS(\"return new Date().getYear()\")1"}],"outputParameters":[{"id":0,"name":"参数1_链接文本","desc":"","type":"text","recordASField":1,"exampleValue":"手机"},{"id":1,"name":"参数2_链接地址","desc":"","type":"text","recordASField":1,"exampleValue":"https://shouji.jd.com/"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2,3],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":0,"option":4,"title":"输入文字","sequence":[],"isInLoop":false,"position":1,"
parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[@id=\"key\"]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"value":"JS(\"return new Date().getYear()\")1","index":0,"allXPaths":["/html/body/div[4]/div[1]/div[2]/div[1]/input[1]","//input[contains(., '')]","id(\"key\")","//INPUT[@class='text']","/html/body/div[last()-6]/div/div[last()-2]/div/input"]}},{"id":3,"index":3,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[4],"isInLoop":false,"position":2,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div/a","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0,"allXPaths":["/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/a[1]","//a[contains(., 
'手机')]","/html/body/div[last()-5]/div/div[last()-4]/div/div[last()-2]/div/div/div/div[last()-1]/div[last()-12]/a[last()-1]"]}},{"id":4,"index":4,"parentId":3,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":1,"contentType":8,"relative":true,"name":"参数1_链接文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"手机"}],"unique_index":"ughtq41gxwnlqia7awp","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0,"splitLine":0},{"nodeType":2,"contentType":0,"relative":true,"name":"参数2_链接地址","desc":"","relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"https://shouji.jd.com/"}],"unique_index":"ughtq41gxwnlqia7awp","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0,"splitLine":0}]}}]}
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/310.json
Normal file
File diff suppressed because one or more lines are too long
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/311.json
Normal file
@@ -0,0 +1 @@
{"id":311,"name":"重命名测试","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"2023-12-28 14:05:20","update_time":"2023-12-28 14:05:43","version":"0.6.0","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"TTT","dataWriteMode":3,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.jd.com"}],"outputParameters":[{"id":0,"name":"参数1_链接文本","desc":"","type":"text","recordASField":1,"exampleValue":"手机"},{"id":1,"name":"参数2_链接地址","desc":"","type":"text","recordASField":1,"exampleValue":"https://shouji.jd.com/"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div/a","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afte
rJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0,"allXPaths":["/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/a[1]","//a[contains(., '手机')]","/html/body/div[last()-5]/div/div[last()-4]/div/div[last()-2]/div/div/div/div[last()-1]/div[last()-12]/a[last()-1]"]}},{"id":3,"index":3,"parentId":2,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":1,"contentType":0,"relative":true,"name":"参数1_链接文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"手机"}],"unique_index":"zvn77ulso2lqoswqo4","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0,"splitLine":0},{"nodeType":2,"contentType":0,"relative":true,"name":"参数2_链接地址","desc":"","relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"https://shouji.jd.com/"}],"unique_index":"zvn77ulso2lqoswqo4","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0,"splitLine":0}]}}]}
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/313.json
Normal file
File diff suppressed because one or more lines are too long
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/314.json
Normal file
File diff suppressed because one or more lines are too long
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/315.json
Normal file
@@ -0,0 +1 @@
{"id":315,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"2023-12-29 22:34:23","update_time":"2023-12-29 22:38:36","version":"0.6.0","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.jd.com"}],"outputParameters":[{"id":0,"name":"Text","desc":"自定义操作返回的数据","type":"text","recordASField":1,"exampleValue":""},{"id":1,"name":"Link","desc":"自定义操作返回的数据","type":"text","recordASField":1,"exampleValue":""}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环点击每个元素","sequence":[4,5,3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div/a","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWa
itTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0,"allXPaths":""}},{"id":5,"index":3,"parentId":2,"type":0,"option":2,"title":"点击元素","sequence":[],"isInLoop":true,"position":2,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"newTab":1,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":""}},{"id":3,"index":4,"parentId":2,"type":0,"option":5,"title":"Text","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":0,"codeMode":2,"code":"return arguments[0].innerText","waitTime":0,"recordASField":1,"paraType":"text","emailConfig":{"host":"","port":465,"username":"","password":"","from":"","to":"","subject":"","content":""}}},{"id":4,"index":5,"parentId":2,"type":0,"option":5,"title":"Link","sequence":[],"isInLoop":true,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"codeMode":2,"code":"return arguments[0].href","waitTime":0,"recordASField":1,"paraType":"text","emailConfig":{"host":"","port":465,"username":"","password":"","from":"","to":"","subject":"","content":""}}}]}
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/316.json
Normal file
@@ -0,0 +1 @@
{"id":316,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"2023-12-30 22:35:04","update_time":"2023-12-30 22:35:12","version":"0.6.0","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"csv","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.jd.com"}],"outputParameters":[{"id":0,"name":"自定义操作","desc":"自定义操作返回的数据","type":"text","recordASField":0,"exampleValue":""}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":0,"option":5,"title":"自定义操作","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"codeMode":12,"code":"","waitTime":0,"recordASField
":0,"paraType":"text","emailConfig":{"host":"","port":465,"username":"","password":"","from":"","to":"","subject":"","content":""}}}]}
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/317.json
Normal file
@@ -0,0 +1 @@
{"id":317,"name":"图片下载","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"2024-01-05 22:14:43","update_time":"2024-01-05 22:15:19","version":"0.6.0","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"csv","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.jd.com"}],"outputParameters":[{"id":0,"name":"参数2_图片地址","desc":"","type":"text","recordASField":1,"exampleValue":"//m.360buyimg.com/babel/jfs/t1/232616/15/5744/219106/656d810aF16705ea9/41c4997dc1b81f17.png"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,3],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":-1,"index":2,"parentId":0,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"
clear":0,"newLine":1,"params":[{"nodeType":4,"contentType":0,"relative":false,"name":"参数1_图片地址","desc":"","extractType":0,"relativeXPath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[2]/div[1]/div[1]/a[1]/img[1]","allXPaths":["/html/body/div[5]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[2]/div[1]/div[1]/a[1]/img[1]","//img[contains(., '')]","/html/body/div[last()-6]/div/div[last()-4]/div/div[last()-1]/div/div[last()-1]/div/div[last()-1]/div/div[last()-3]/div/div/a/img"],"exampleValues":[{"num":0,"value":"//m.360buyimg.com/babel/s1420x740_jfs/t1/194401/20/32669/76553/64142a96F7733e6ad/cf2727848c86cf45.jpg!q70.dpg"}],"unique_index":"i9in42ta6klr0pwp4k","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0,"splitLine":0}]}},{"id":2,"index":3,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[4],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div/div[1]/div[1]/a[1]/img[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0,"allXPaths":["/html/body/div[5]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/a[1]/img[1]","//img[contains(., 
'')]","/html/body/div[last()-5]/div/div[last()-4]/div/div[last()-1]/div/div[last()-1]/div/div[last()-1]/div/div[last()-4]/div/div/a/img"]}},{"id":3,"index":4,"parentId":2,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":4,"contentType":0,"relative":true,"name":"参数2_图片地址","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"//m.360buyimg.com/babel/jfs/t1/232616/15/5744/219106/656d810aF16705ea9/41c4997dc1b81f17.png"}],"unique_index":"i81avec75qflr0pwym8","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":1,"splitLine":0}]}}]}
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/318.json
Normal file
@@ -0,0 +1 @@
{"id":318,"name":"京东(JD.COM)-正品低价、品质保障、配送及时、轻松购物!","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"2024-04-22 05:08:03","update_time":"2024-04-22 05:19:48","version":"0.6.2","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"csv","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.jd.com"}],"outputParameters":[{"id":0,"name":"参数1_链接文本","desc":"","type":"text","recordASField":1,"exampleValue":"电脑数码"},{"id":1,"name":"参数2_链接地址","desc":"","type":"text","recordASField":1,"exampleValue":"https://prodev.jd.com/mall/active/31XPWPTonxJ9e5YoQ85HS7z8XNYQ/index.html?babelChannel=ttt40"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[4]/div[1]/div[4]/ul[1]/li/a[1]
","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0,"allXPaths":["/html/body/div[1]/div[4]/div[1]/div[4]/ul[1]/li[1]/a[1]","//a[contains(., '电脑数码')]","//A[@class='navitems-lk']","/html/body/div[last()-5]/div[last()-2]/div/div[last()-1]/ul/li[last()-8]/a"]}},{"id":3,"index":3,"parentId":2,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":1,"contentType":15,"relative":true,"name":"参数1_链接文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"电脑数码"}],"unique_index":"auwkv5g1krqlva0tsc4","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"123","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0,"splitLine":0},{"nodeType":2,"contentType":0,"relative":true,"name":"参数2_链接地址","desc":"","relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"https://prodev.jd.com/mall/active/31XPWPTonxJ9e5YoQ85HS7z8XNYQ/index.html?babelChannel=ttt40"}],"unique_index":"auwkv5g1krqlva0tsc4","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0,"splitLine":0}]}}]}
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/319.json
Normal file
@@ -0,0 +1 @@
{"id":-2,"name":"百度一下,你就知道","url":"https://www.baidu.com?id=1","links":"https://www.baidu.com?id=11\nhttps://www.baidu.com?id=12","create_time":"2024-04-22 05:45:12","update_time":"2024-04-22 05:45:20","version":"0.6.2","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"csv","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.baidu.com?id=1","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.baidu.com?id=11\nhttps://www.baidu.com?id=12","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.baidu.com?id=11\nhttps://www.baidu.com?id=12"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.baidu.com?id=1","links":"https://www.baidu.com?id=11\nhttps://www.baidu.com?id=12","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}}]}
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/320.json
Normal file
@@ -0,0 +1 @@
{"id":320,"name":"百度一下,你就知道","url":"https://www.baidu.com","links":"https://www.baidu.com","create_time":"2024-04-22 05:53:18","update_time":"2024-04-22 05:53:28","version":"0.6.2","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"csv","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.baidu.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.baidu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.baidu.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.baidu.com","links":"https://www.baidu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环点击每个元素","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[2]/div[1]/div[5]/div[1]/div[1]/div[3]/ul[1]/li/a[1]/span[2]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":
"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0,"allXPaths":""}},{"id":3,"index":3,"parentId":2,"type":0,"option":2,"title":"点击元素","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"newTab":1,"maxWaitTime":10,"params":[],"alertHandleType":0,"downloadWaitTime":3600,"allXPaths":""}}]}
|
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/321.json
Normal file
@ -0,0 +1 @@
|
||||
{"id":321,"name":"百度一下,你就知道","url":"https://www.baidu.com","links":"https://www.baidu.com","create_time":"2024-04-22 07:02:02","update_time":"2024-04-22 07:02:16","version":"0.6.2","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"csv","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.baidu.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.baidu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.baidu.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.baidu.com","links":"https://www.baidu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环点击每个元素","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[2]/div[1]/div[5]/div[1]/div[1]/div[3]/ul[1]/li/a[1]/span[2]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":
"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0,"allXPaths":""}},{"id":3,"index":3,"parentId":2,"type":0,"option":2,"title":"点击元素","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"newTab":1,"maxWaitTime":10,"params":[],"alertHandleType":0,"downloadWaitTime":3600,"allXPaths":""}}]}
|
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/322.json
Normal file
@ -0,0 +1 @@
|
||||
{"id":322,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"2024-04-22 08:13:15","update_time":"2024-04-22 08:13:33","version":"0.6.2","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"csv","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.jd.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环点击每个元素","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div/a","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","code
":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0,"allXPaths":""}},{"id":3,"index":3,"parentId":2,"type":0,"option":2,"title":"点击元素","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"newTab":1,"maxWaitTime":10,"params":[],"alertHandleType":0,"downloadWaitTime":3600,"allXPaths":""}}]}
|
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/323.json
Normal file
@ -0,0 +1 @@
|
||||
{"id":323,"name":"新web采集任务","url":"https://www.baidu.com","links":"https://www.baidu.com","create_time":"","update_time":"2024-08-10 17:29:04","version":"0.6.2","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"csv","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.baidu.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.baidu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.baidu.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.baidu.com","links":"https://www.baidu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}}]}
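The sample task files above share one shape: a `graph` list of nodes, where each node carries a `title` and a `sequence` list naming its children. The sketch below prints such a graph as an indented outline. It assumes that the numbers in `sequence` are positions in the `graph` list (which matches the samples, where `index` equals the list position); the `walk` helper and the trimmed-down task are illustrative and not part of EasySpider.

```python
import json


def walk(graph, index=0, depth=0):
    """Yield (depth, title) for each node, following each node's `sequence` list.

    Assumption: entries in `sequence` are positions in the `graph` list.
    """
    node = graph[index]
    yield depth, node["title"]
    for child in node.get("sequence", []):
        yield from walk(graph, child, depth + 1)


# A trimmed-down task in the same shape as the sample files (most fields omitted).
task = json.loads("""
{"id": 323, "name": "demo", "graph": [
  {"index": 0, "id": 0, "title": "root", "sequence": [1, 2]},
  {"index": 1, "id": 1, "title": "open page", "sequence": []},
  {"index": 2, "id": 2, "title": "loop", "sequence": [3]},
  {"index": 3, "id": 3, "title": "click element", "sequence": []}
]}
""")

outline = list(walk(task["graph"]))
for depth, title in outline:
    print("  " * depth + title)
```

Reading a real task file would only change how `task` is loaded, for example with `json.load` on the opened `.json` file.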
|
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/324.json
Normal file
File diff suppressed because one or more lines are too long
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/325.json
Normal file
@ -0,0 +1 @@
|
||||
{"id":325,"name":"百度一下,你就知道","url":"https://www.baidu.com","links":"https://www.baidu.com","create_time":"2024-12-30 22:37:29","update_time":"2024-12-30 22:37:43","version":"0.6.3","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"csv","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.baidu.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.baidu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.baidu.com"}],"outputParameters":[{"id":0,"name":"参数1_链接文本","desc":"","type":"text","recordASField":1,"exampleValue":"0暖心2024 总书记的贴心话"},{"id":1,"name":"参数2_链接地址","desc":"","type":"text","recordASField":1,"exampleValue":"https://www.baidu.com/s?wd=%E6%9A%96%E5%BF%832024+%E6%80%BB%E4%B9%A6%E8%AE%B0%E7%9A%84%E8%B4%B4%E5%BF%83%E8%AF%9D&sa=fyb_n_homepage&rsv_dl=fyb_n_homepage&from=super&cl=3&tn=baidutop10&fr=top1000&rsv_idx=2&hisfilter=1"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.baidu.com","links":"https://www.baidu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":fa
lse,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[1]/div[5]/div[1]/div[1]/div[3]/ul[1]/li/a[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0,"allXPaths":["/html/body/div[1]/div[1]/div[5]/div[1]/div[1]/div[3]/ul[1]/li[1]/a[1]","//a[contains(., '0暖心2024 总')]","//a[@class='title-content c-link c-font-medium c-line-clamp1']","/html/body/div[last()-4]/div[last()-3]/div[last()-3]/div/div/div/ul/li[last()-9]/a"]}},{"id":3,"index":3,"parentId":2,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":1,"contentType":8,"relative":true,"name":"参数1_链接文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"0暖心2024 
总书记的贴心话"}],"unique_index":"8rtq2is658sm5b58osr","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0,"splitLine":0},{"nodeType":2,"contentType":0,"relative":true,"name":"参数2_链接地址","desc":"","relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"https://www.baidu.com/s?wd=%E6%9A%96%E5%BF%832024+%E6%80%BB%E4%B9%A6%E8%AE%B0%E7%9A%84%E8%B4%B4%E5%BF%83%E8%AF%9D&sa=fyb_n_homepage&rsv_dl=fyb_n_homepage&from=super&cl=3&tn=baidutop10&fr=top1000&rsv_idx=2&hisfilter=1"}],"unique_index":"8rtq2is658sm5b58osr","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0,"splitLine":0}]}}]}
|
@ -1 +1 @@
|
||||
{"id":70,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"5/24/2023, 8:21:45 PM","version":"0.3.1","containJudge":false,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"string","exampleValue":"https://www.jd.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","wait":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":""}},{"id":3,"index":3,"parentId":2,"type":0,"option":7,"title":"移动到元素","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":2,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"allXPaths":"","loopType":1}}]}
|
||||
{"id":-2,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"5/24/2023, 8:21:45 PM","version":"0.3.1","containJudge":false,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"string","exampleValue":"https://www.jd.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","wait":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":""}},{"id":3,"index":3,"parentId":2,"type":0,"option":7,"title":"移动到元素","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":2,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"allXPaths":"","loopType":1}}]}
|
File diff suppressed because one or more lines are too long
5
.temp_to_pub/EasySpider_MacOS/first_time_run.sh
Normal file
@ -0,0 +1,5 @@
|
||||
#!/bin/bash
|
||||
|
||||
xattr -cr EasySpider.app
|
||||
xattr -cr easyspider_executestage
|
||||
xattr -cr easyspider_executestage_full
|
@ -23,7 +23,7 @@ For more complex operations, please download the source code and compile it for
|
||||
"""
|
||||
|
||||
# 请在下面编写你的代码,不要有代码缩进!!! | Please write your code below, do not indent the code!!!
|
||||
|
||||
print(globals())
|
||||
# 导包 | Import packages
|
||||
from selenium.common.exceptions import ElementClickInterceptedException
|
||||
|
||||
@ -56,3 +56,20 @@ finally:
|
||||
print("All parameters:", self.outputParameters)
|
||||
print(test(3))
|
||||
print("执行完毕|Execution completed")
|
||||
|
||||
import time
|
||||
time.sleep(3)
|
||||
|
||||
def new_line(outputParameters, maxViewLength, record):
|
||||
    line = []
|
||||
    print("Use this function to print a new line in the console")
|
||||
    i = 0
|
||||
    for value in outputParameters.values():
|
||||
        line.append(value)
|
||||
        if record[i]:
|
||||
            print(value[:maxViewLength], " ", end="")
|
||||
        i += 1
|
||||
    print("")
|
||||
    return line
|
||||
|
||||
new_line(self.outputParameters, 10, [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True])
|
@ -1,87 +1,22 @@
|
||||
Official Site: https://www.easyspider.net
|
||||
|
||||
You are welcome to recommend this software to your friends.
|
||||
You are welcome to recommend this software to your friends and star our GitHub repository!
|
||||
|
||||
This version is for MacOS and runs on all chips, including Intel (such as Core i7) and Arm (such as M1). It supports MacOS 11.x and above.
|
||||
|
||||
If your MacOS version is 10.x or below, please download EasySpider V0.2.0.
|
||||
|
||||
The software's open-source code repository on GitHub: https://github.com/NaiboWang/EasySpider
|
||||
|
||||
Official documentation can be found at: https://github.com/NaiboWang/EasySpider/wiki
|
||||
|
||||
Video Tutorial: https://youtube.com/playlist?list=PL0kEFEkWrT7mt9MUlEBV2DTo1QsaanUTp
|
||||
|
||||
You can import tasks from other machines by simply opening the EasySpider software in this directory, right-clicking "Show Package Contents", and then placing the .json files from the tasks folder in the /Users/your user name/Library/Application Support/EasySpider/tasks folder of the other machine. Similarly, execution ID files can be imported by copying the .json files from the execution_instances folder. Please note that the .json files in both folders only support names greater than 0.
|
||||
You can import tasks from other machines by simply opening the EasySpider software in this directory, right-clicking "Show Package Contents", and then placing the .json files from the tasks folder in the /Users/Your User Name/Library/Application Support/EasySpider/tasks folder of the other machine. Similarly, execution ID files can be imported by copying the .json files from the execution_instances folder. Please note that the .json files in both folders only support names greater than 0.
|
||||
|
||||
You can quickly navigate to the tasks folder using the following commands:
|
||||
|
||||
cd /Users/$(whoami)/Library/Application\ Support/EasySpider/tasks
|
||||
open .
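The import rule described above (copy `.json` files into the tasks folder, and only files named with a number greater than 0 are supported) can be sketched as a small filter-and-copy helper. This is a minimal illustration rather than anything shipped with EasySpider: `import_tasks` is a hypothetical name, and the demo runs against temporary folders instead of the real tasks directory.

```python
import shutil
import tempfile
from pathlib import Path


def import_tasks(src: Path, dst: Path) -> list[str]:
    """Copy files named <positive integer>.json from src to dst; return copied names."""
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in sorted(src.glob("*.json")):
        # Only names such as 12.json qualify; 0.json, -2.json and notes.json are skipped.
        if f.stem.isdecimal() and int(f.stem) > 0:
            shutil.copy2(f, dst / f.name)
            copied.append(f.name)
    return copied


# Demo in a throwaway directory so nothing outside it is touched.
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp, "src"), Path(tmp, "dst")
    src.mkdir()
    for name in ("12.json", "0.json", "-2.json", "notes.json"):
        (src / name).write_text("{}")
    copied = import_tasks(src, dst)
    print(copied)
```

On MacOS the destination would be `Path.home() / "Library" / "Application Support" / "EasySpider" / "tasks"`, the folder named in the text above.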
|
||||
|
||||
If you need to press p on the keyboard to pause and resume task execution, you must grant the program keyboard monitoring permission.
|
||||
|
||||
======Version Update Instruction======
|
||||
|
||||
For new features in versions later than v0.3.2, see the GitHub releases page: https://github.com/NaiboWang/EasySpider/releases
|
||||
|
||||
-----v0.3.2-----
|
||||
|
||||
## Update Instruction
|
||||
|
||||
1. Selected child element operations can delete fields and unmark deleted fields in real-time in the browser.
|
||||
2. Selecting child elements adds a selection mode that allows you to choose only the child elements that are present in all blocks or the child elements that are the same as the first selected block.
|
||||
3. In the text input and webpage open options, you can use the extracted field value as a variable for text input, represented by Field["field_name"].
|
||||
4. Files can be downloaded, such as PDF files.
|
||||
5. Fixed a bug where the software could display a blank screen for about 10 seconds after opening, making it usable in intranets, darknets, and any local network.
|
||||
6. Fixed a bug where the current page URL and title could not be extracted.
|
||||
7. Fixed a bug where OCR recognition could fail to extract information.
|
||||
8. Updated extraction logic to save locally every 10 records collected.
|
||||
9. When modifying a task, the default anchor position is set to after the last operation in the task flow.
|
||||
10. Updated Chrome version to 114.
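Item 3 above describes using an extracted field value as a variable written as `Field["field_name"]`. How EasySpider performs the substitution internally is not shown in this document, so the following is only a sketch of the idea: a regular expression finds each placeholder and replaces it with the matching value. The names `FIELD_PATTERN` and `fill_fields`, and the example URL, are assumptions made for illustration.

```python
import re

# Matches Field["some name"] and captures the name between the quotes.
FIELD_PATTERN = re.compile(r'Field\["([^"]+)"\]')


def fill_fields(template: str, fields: dict[str, str]) -> str:
    """Replace each Field["name"] with fields[name]; unknown names are left untouched."""
    return FIELD_PATTERN.sub(lambda m: fields.get(m.group(1), m.group(0)), template)


result = fill_fields('https://example.com/search?q=Field["keyword"]', {"keyword": "spider"})
print(result)
```

Leaving unknown placeholders in place, rather than replacing them with an empty string, makes a misspelled field name easy to spot in the output.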
|
||||
|
||||
-----V0.3.1-----
|
||||
|
||||
|
||||
1. Advanced Operations:
|
||||
|
||||
- Custom scripts can be executed in the workflow, including executing JavaScript commands in the browser and invoking scripts at the operating system level. The command's return value can be obtained and recorded, greatly expanding the scope of operations.
|
||||
- Before and after each operation, you can specify a JavaScript command to be executed targeting the current located element.
|
||||
|
||||
2. Custom scripts are also supported in the conditions and loop conditions. The return value of the custom script determines the condition for the judgment of conditions and loops, greatly enhancing the flexibility of tasks. The ability to use the break statement within a loop is added, allowing custom operations to manipulate elements within the loop.
|
||||
|
||||
|
||||
3. Multiple XPath expressions are generated simultaneously for user selection, and the XPath Helper extension is pre-installed for XPath debugging.
|
||||
|
||||
4. Added the functionality to extract the background image URL of elements, current page title, and current page URL.
|
||||
|
||||
5. Added the capability to save screenshots of elements or entire web pages. This feature works best in headless mode.
|
||||
|
||||
6. Added the functionality to download images.
|
||||
|
||||
7. Added OCR recognition of elements. To use this feature, Tesseract library needs to be installed first: https://tesseract-ocr.github.io/tessdoc/Installation.html
|
||||
|
||||
8. Directly extract the return value of executing JavaScript code on elements, allowing for functionalities such as regular expression matching and obtaining the background color of elements.
|
||||
|
||||
9. Added the capability to switch dropdown options and extract the selected value and text of dropdown options.
|
||||
|
||||
10. Significantly improved user guidance and explanations to make the software more user-friendly. This includes instructions on handling iframe tags, explanations of parameter meanings for various options, and explanations on modifying the XPath for loop items, and more.
|
||||
|
||||
11. Added instructions on how to execute tasks from the command line.
|
||||
|
||||
12. Added headless mode configuration, allowing the software to run without a browser interface.
|
||||
|
||||
13. Fixed the issue where Chinese paths couldn't be recognized correctly when using user-configured browser modes.
|
||||
|
||||
14. Fixed the issue where the program would freeze when there was no unconditional branch in the conditional branching.
|
||||
|
||||
15. Fixed the issue where the input box would freeze after saving a task.
|
||||
|
||||
16. Added the option to set the maximum waiting time for page load in the "Open Page" and "Click element" operations.
|
||||
|
||||
17. Added the functionality to move the mouse to an element.
|
||||
|
||||
18. Displays a prompt when an element cannot be found.
|
||||
|
||||
19. Fixed the webpage scrolling bug.
|
||||
|
||||
20. The task name is initialized with the value of the page title upon the first visit.
|
||||
|
||||
21. Added version update prompts.
|
||||
|
||||
22. Added the information of the publisher as requested.
|
||||
|
||||
23. Updated Chrome version to 113.
|
||||
|
||||
|
||||
|
@ -3,7 +3,7 @@
|
||||
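The MacOS version of the software has one potential issue: the Chrome that the software launches frequently updates itself after it is opened, but the Chromedriver that the software depends on is not updated along with Chrome, so the software may become unable to open Chrome.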
|
||||
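To check the Chrome version: go inside the EasySpider app by right-clicking it and choosing "Show Package Contents", open the Contents/Resources/app folder, double-click chrome_mac64 to launch Chrome, then open Settings -> About Chrome and check whether the Chrome version matches the version displayed when you manually open chromedriver_mac64.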
|
||||
|
||||
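If they differ, download the macOS Chromedriver that matches your current Chrome version from https://googlechromelabs.github.io/chrome-for-testing, put the chromedriver file in the Contents/Resources/app folder mentioned above, and rename it to replace the "chromedriver_mac64" file; the software will then work normally again.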
|
||||
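If they differ, download the macOS Chromedriver that matches your current Chrome version (only the major version number before the first dot matters, e.g. 122) from https://googlechromelabs.github.io/chrome-for-testing, put the chromedriver file in the Contents/Resources/app folder mentioned above, and rename it to replace the "chromedriver_mac64" file; the software will then work normally again.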
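The check described above comes down to comparing the number before the first dot in the two version strings. A minimal sketch of that comparison follows; the helper names and the sample version strings are made up for illustration, and the text you paste in would come from Chrome's About page and from the chromedriver output.

```python
import re


def major_version(version_text: str) -> int:
    """Return the number before the first dot of the first dotted version found."""
    match = re.search(r"(\d+)\.\d+", version_text)
    if match is None:
        raise ValueError(f"no version number in {version_text!r}")
    return int(match.group(1))


def versions_match(chrome_text: str, driver_text: str) -> bool:
    """True when Chrome and Chromedriver share the same major version."""
    return major_version(chrome_text) == major_version(driver_text)


# Hypothetical version strings, used only to exercise the helpers.
print(versions_match("Google Chrome 122.0.6261.94", "ChromeDriver 122.0.6261.128"))
print(versions_match("Google Chrome 123.0.6312.58", "ChromeDriver 122.0.6261.128"))
```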
|
||||
|
||||
If you run into other problems while using the software, please open an issue on the GitHub Issues page.
|
||||
|
||||
|
@ -1,6 +1,26 @@
|
||||
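Because of MacOS's complex security settings, the first time you open the software it is reported as coming from an unverified developer and is blocked from opening. Please refer to the following video to see how to open the software and run tasks on MacOS: https://www.bilibili.com/video/BV1E34y137fT/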
|
||||
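Because of MacOS's complex security settings, the first time you open the software it is reported as coming from an unverified developer and is blocked from opening. Unlock it as follows: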
|
||||
|
||||
The main steps are:
|
||||
1. Open a system Terminal window.
|
||||
|
||||
2. Change to the EasySpider directory, for example:
|
||||
|
||||
cd ~/Downloads/EasySpider_MacOS
|
||||
|
||||
3. In the EasySpider directory, run the `first_time_run.sh` script found there with the following command to modify the package attributes:
|
||||
|
||||
bash first_time_run.sh
|
||||
|
||||
This unlocks EasySpider in one step so that it can be used normally, including both the design-stage and the execution-stage programs.
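For reference, `first_time_run.sh` runs `xattr -cr` on three paths: `EasySpider.app`, `easyspider_executestage` and `easyspider_executestage_full`. The sketch below performs the same calls from Python. It is an illustration under the assumption that it is started from the EasySpider directory on MacOS; the function names are hypothetical, and the shell script remains the documented route.

```python
import subprocess
import sys

# The three paths cleared by first_time_run.sh.
TARGETS = ["EasySpider.app", "easyspider_executestage", "easyspider_executestage_full"]


def unlock_commands(targets=TARGETS):
    """Build one `xattr -cr <path>` command per target."""
    return [["xattr", "-cr", target] for target in targets]


def unlock(targets=TARGETS):
    """Run each command in turn; a failure on one path does not stop the rest."""
    for cmd in unlock_commands(targets):
        # check=False: a non-zero exit status is not raised as an exception.
        subprocess.run(cmd, check=False)


# xattr is a MacOS tool, so the commands are only executed there.
if __name__ == "__main__" and sys.platform == "darwin":
    unlock()
```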
|
||||
|
||||
|
||||
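If an error like the one below appears while the command runs, it can be ignored; once the command finishes, the software can be opened: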
|
||||
|
||||
xattr: [Errno 13] Permission denied: 'EasySpider.app/Contents/Resources/app/node_modules/node-window-manager/build/node_gyp_bins/python3'
|
||||
|
||||
|
||||
|
||||
|
||||
Below is an alternative approach; please refer to the following video to see how to open the software and run tasks on MacOS: https://www.bilibili.com/video/BV1E34y137fT/
|
||||
|
||||
- Design stage - MacOS on Apple Arm chips
|
||||
|
||||
|
@ -1,4 +1,4 @@
|
||||
Feel free to recommend the software to more friends who need it!
|
||||
Feel free to recommend the software to more friends who need it and star our GitHub repository!
|
||||
|
||||
Official site: https://www.easyspider.cn
|
||||
|
||||
@ -6,104 +6,17 @@
|
||||
|
||||
For MacOS 10.x, please download and use v0.2.0.
|
||||
|
||||
GitHub repository for the software's open-source code: https://github.com/NaiboWang/EasySpider
|
||||
|
||||
Official documentation: https://github.com/NaiboWang/EasySpider/wiki
|
||||
|
||||
Video tutorial: https://www.bilibili.com/video/BV1th411A7ey/
|
||||
|
||||
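Tasks can be imported from other machines: simply put the .json files from the other machine's tasks folder into the /Users/your user name/Library/Application Support/EasySpider/tasks folder. Likewise, execution ID files can be imported by copying the .json files from the execution_instances folder. Note that the .json files in both folders must be named with a number greater than 0.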
|
||||
|
||||
You can quickly open the tasks folder with the following commands:
|
||||
|
||||
cd /Users/$(whoami)/Library/Application\ Support/EasySpider/tasks
|
||||
open .
|
||||
|
||||
To pause and resume task execution by pressing the p key, the program must be granted keyboard monitoring permission.
|
||||
|
||||
======Version Update Notes======
|
||||
|
||||
For update notes on versions later than v0.3.2, see the GitHub Releases page: https://github.com/NaiboWang/EasySpider/releases
|
||||
|
||||
-----v0.3.2-----
|
||||
|
||||
## Update Notes
|
||||
|
||||
1. The select-child-elements operation can delete fields, and deleted fields are unmarked in the browser in real time.
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/e016c832-6ff9-4814-b86c-38787e73aa30" width=50% />
|
||||
|
||||
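2. Selecting child elements now offers selection modes: you can select only the child elements present in every block, or the child elements in all blocks that match those of the first selected block.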
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/0082b11d-96bc-43f1-acdb-8280decb48b4" width=50% />
|
||||
|
||||
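3. In the text input and open-page options, the most recently extracted field value can be used **as a variable** for text input; the variable is written as `Field["field_name"]`.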
|
||||

|
||||
|
||||
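4. Files such as PDFs can be downloaded.
|
||||
5. Fixed a bug where the software could show a blank screen for about 10 seconds after opening, so that it can be used on intranets, the dark web, and any local network.
|
||||
6. Fixed a bug where the current page URL and title sometimes could not be extracted.
|
||||
7. Fixed a bug where OCR recognition could fail to extract content.
|
||||
8. Extraction logic updated to save locally after every 10 records collected.
|
||||
9. When modifying a task, the default anchor position is after the last operation in the task flow.
|
||||
10. Updated Chrome to version 114.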
|
||||
|
||||
|
||||
------V0.3.1------
|
||||
|
||||
If downloads are slow, consider the mirror for mainland China: [download link for mainland China](https://github.com/NaiboWang/EasySpider/releases/download/v0.3.0/Download_Link_Address_in_China_Mainland.txt).
|
||||
|
||||
### Watching the new-feature walkthrough videos is strongly recommended
|
||||
|
||||
Videos covering the latest version's features have been uploaded to Bilibili; they are very useful and well worth watching.
|
||||
|
||||
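[(Important) Custom condition checks using the return value of a JS command inside a loop item - Part 2](https://www.bilibili.com/video/BV1mu411x7Nn/)
|
||||
|
||||
[How to run multiple tasks at once (parallel instances)](https://www.bilibili.com/video/BV13c411G7LE/)
|
||||
|
||||
[How to run your own JS code and system code (custom operations)](https://www.bilibili.com/video/BV1qs4y1z7Hc/)
|
||||
|
||||
[How to customize loop and condition checks - Part 1](https://www.bilibili.com/video/BV1Ys4y1z777/)
|
||||
|
||||
[How to take screenshots of elements and web pages, plus a guide to (headless mode) command-line execution](https://www.bilibili.com/video/BV1dV4y1z764/)
|
||||
|
||||
[OCR recognition of element content](https://www.bilibili.com/video/BV1xz4y1b72D/)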
|
||||
|
||||
Note: the `.json` files in the tasks folder for v0.3.1 are incompatible with all previous versions; please redesign your tasks for v0.3.1.
|
||||
|
||||
## Update Notes
|
||||
1. Custom operations:
|
||||
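- **Custom scripts can be executed** in the task flow, including **running JavaScript commands** in the browser and **invoking scripts at the operating system level**; the **command's return value can be obtained and recorded**, which greatly expands what tasks can do.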
|
||||
|
||||

|
||||
|
||||
- Before and after each operation, you can specify a JavaScript snippet to run against the currently located element.
|
||||
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/dde64388-5668-40ff-951e-fb8f60655c49" height=50% width=50%>
|
||||
|
||||
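2. **Conditions and loop conditions** can now also **execute custom scripts**; whether the script's return value is truthy decides the condition or loop check, which likewise greatly increases task flexibility. Loops gain a setting to break via code, and custom operations can act on elements inside the loop.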
|
||||

|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/5ce7cf50-e5c9-4714-a83b-9c65934e9c68" width=50%></img>
|
||||
|
||||
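3. Multiple XPath expressions are generated at once for the user to choose from, and the **XPath Helper extension is pre-installed** for debugging XPath.
|
||||
4. Added extraction of an element's background image URL, the current page title, and the current page URL.
|
||||
5. Added saving of element screenshots; use this to capture an element or the entire web page (works best with headless mode).
|
||||
6. Added image downloading.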
|
||||
7. Added OCR recognition of elements (to use this feature you must first install the Tesseract library yourself: [https://blog.csdn.net/u010454030/article/details/80515501](https://blog.csdn.net/u010454030/article/details/80515501))
|
||||
|
||||
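8. The return value of JavaScript code executed on an element can be extracted directly, enabling things like regular-expression matching and reading an element's background color.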
|
||||
9. Added switching of dropdown options and extraction of the currently selected option's value and text.
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/c0b2bec1-2a97-4516-930e-1b310697212b" width=50%></img>
|
||||
|
||||

|
||||
|
||||
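10. Greatly expanded usage hints and explanations to make the software easier to use (for example, how to handle iframe tags, what each option's parameters mean, and how to modify the XPath of loop items).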
|
||||
11. When running a task, a hint is now shown on how to execute tasks from the command line: [https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction](https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction).
|
||||

|
||||
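12. Added a mode for running multiple instances in parallel.
|
||||
13. Added headless mode, a configuration that runs without a browser interface.
|
||||
14. Fixed an issue where Chinese paths were not recognized correctly when using the user-configured browser mode.
|
||||
15. Fixed an issue where the program froze when a conditional branch had no unconditional branch.
|
||||
16. Fixed an issue where the input box froze after saving a task.
|
||||
17. The open-page and click-element operations gain a setting for the maximum page-load wait time.
|
||||
18. Added moving the mouse to an element.
|
||||
19. A prompt is shown when an element cannot be found.
|
||||
20. Fixed the web page scrolling bug.
|
||||
21. Added an operation for adding extract-data fields.
|
||||
22. The task name is initialized to the title of the first page visited.
|
||||
23. Added version update prompts.
|
||||
24. Added publisher information, as requested.
|
||||
25. Updated Chrome to version 113.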
|
||||
|
||||
|
||||
|
||||
|
||||
|
@ -23,7 +23,7 @@ For more complex operations, please download the source code and compile it for
|
||||
"""
|
||||
|
||||
# 请在下面编写你的代码,不要有代码缩进!!! | Please write your code below, do not indent the code!!!
|
||||
|
||||
print(globals())
|
||||
# 导包 | Import packages
|
||||
from selenium.common.exceptions import ElementClickInterceptedException
|
||||
|
||||
@ -56,3 +56,20 @@ finally:
|
||||
print("All parameters:", self.outputParameters)
|
||||
print(test(3))
|
||||
print("执行完毕|Execution completed")
|
||||
|
||||
import time
|
||||
time.sleep(3)
|
||||
|
||||
def new_line(outputParameters, maxViewLength, record):
|
||||
    line = []
|
||||
    print("Use this function to print a new line in the console")
|
||||
    i = 0
|
||||
    for value in outputParameters.values():
|
||||
        line.append(value)
|
||||
        if record[i]:
|
||||
            print(value[:maxViewLength], " ", end="")
|
||||
        i += 1
|
||||
    print("")
|
||||
    return line
|
||||
|
||||
new_line(self.outputParameters, 10, [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True])
|
@ -1,86 +1,15 @@
|
||||
Official Site: https://www.easyspider.net
|
||||
|
||||
You are welcome to recommend this software to your friends.
|
||||
You are welcome to recommend this software to your friends and star our GitHub repository!
|
||||
|
||||
This version is for Windows 7 and above, including both 32-bit and 64-bit version. Please note that this version of the Chrome browser will always remain at version 109 and will not update with Chrome updates (for compatibility with Windows 7). Therefore, if you want to use the latest version of the Chrome browser for data scraping, please run the x64 version of EasySpider on Windows 10 x64 or higher systems.
|
||||
This version is for Windows 7 and above, on both 32-bit and 64-bit systems. Please note that the Chrome browser bundled with this version stays at version 109 and will not update along with Chrome (for compatibility with Windows 7). If you want to scrape data with the latest Chrome, run the x64 version of EasySpider on Windows 10 x64 or later. No release supports Windows Server 2012 or earlier; on those systems you need to compile the software yourself.
|
||||
|
||||
Video Tutorial: https://youtube.com/playlist?list=PL0kEFEkWrT7mt9MUlEBV2DTo1QsaanUTp
|
||||
|
||||
The software's open-source code repository on GitHub: https://github.com/NaiboWang/EasySpider
|
||||
|
||||
Official documentation can be found at: https://github.com/NaiboWang/EasySpider/wiki
|
||||
|
||||
The software is not a trojan or virus. If antivirus software such as Windows Defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software instead.
|
||||
|
||||
Tasks can be imported from other machines by simply placing the .json files from the "tasks" folder of those machines into the "tasks" folder of this directory. Similarly, execution instance files can be imported by copying the .json files from the "execution_instances" folder. Note that only files named with a number greater than 0 are supported in both folders.
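The import rule above can be sketched as a small helper. This is only an illustration: the function name `import_task_files` and its directory arguments are hypothetical, and it assumes that "named with a number greater than 0" means the file stem is a positive integer such as `228.json`.

```python
import shutil
from pathlib import Path


def import_task_files(source_dir, target_dir):
    """Copy .json files whose names are positive integers (e.g. 12.json)."""
    source, target = Path(source_dir), Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.glob("*.json")):
        stem = path.stem
        # Assumption: names such as "0.json", "-2.json" or "abc.json" are skipped.
        if stem.isascii() and stem.isdigit() and int(stem) > 0:
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied
```

The same helper could be pointed at an "execution_instances" folder, since the text describes the same naming rule for both folders.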
|
||||
|
||||
|
||||
======Version New Features======
|
||||
|
||||
For new features in versions later than v0.3.2, see the GitHub releases page: https://github.com/NaiboWang/EasySpider/releases
|
||||
|
||||
-----v0.3.2-----
|
||||
|
||||
## Update Notes
|
||||
|
||||
1. Selected child element operations can delete fields and unmark deleted fields in real-time in the browser.
|
||||
2. Selecting child elements adds a selection mode that allows you to choose only the child elements that are present in all blocks or the child elements that are the same as the first selected block.
|
||||
3. In the text input and webpage open options, you can use the extracted field value as a variable for text input, represented by Field["field_name"].
|
||||
4. Files can be downloaded, such as PDF files.
|
||||
5. Fixed a bug where the software could display a blank screen for about 10 seconds after opening, making it usable in intranets, darknets, and any local network.
|
||||
6. Fixed a bug where the current page URL and title could not be extracted.
|
||||
7. Fixed a bug where OCR recognition could fail to extract information.
|
||||
8. Updated extraction logic to save locally every 10 records collected.
|
||||
9. When modifying a task, the default anchor position is set to after the last operation in the task flow.
|
||||
10. Updated Chrome version to 114.
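Item 3 in the list above describes `Field["field_name"]` variables. A minimal sketch of how such a substitution could work is shown here; `substitute_fields` and `last_values` are hypothetical names chosen for illustration, not EasySpider's own code, and leaving unknown field names untouched is an assumption.

```python
import re

# Matches Field["name"] and captures the name between the quotes.
FIELD_PATTERN = re.compile(r'Field\["([^"\]]+)"\]')


def substitute_fields(text, last_values):
    """Replace each Field["name"] with the most recently extracted value."""

    def lookup(match):
        name = match.group(1)
        # Assumption: a field that has not been extracted yet is left as-is.
        return str(last_values.get(name, match.group(0)))

    return FIELD_PATTERN.sub(lookup, text)


url = substitute_fields(
    'https://example.com/search?q=Field["keyword"]', {"keyword": "spider"}
)
print(url)
```

Passing a function to `re.sub` avoids any special treatment of backslashes in the extracted values.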
|
||||
|
||||
-----v0.3.1-----
|
||||
|
||||
## Update Notes
|
||||
|
||||
|
||||
1. Advanced Operations:
|
||||
|
||||
- Custom scripts can be executed in the workflow, including executing JavaScript commands in the browser and invoking scripts at the operating system level. The command's return value can be obtained and recorded, greatly expanding the scope of operations.
|
||||
- Before and after each operation, you can specify a JavaScript command to be executed targeting the current located element.
|
||||
|
||||
2. Custom scripts are also supported in conditions and loop conditions: whether the script's return value is true decides the outcome of the condition or loop check, greatly enhancing the flexibility of tasks. A break statement can now be used within a loop, and custom operations can manipulate elements within the loop.
|
||||
|
||||
|
||||
3. Multiple XPath expressions are generated simultaneously for user selection, and the XPath Helper extension is pre-installed for XPath debugging.
|
||||
|
||||
4. Added the functionality to extract the background image URL of elements, current page title, and current page URL.
|
||||
|
||||
5. Added the capability to save screenshots of elements or entire web pages. This feature works best in headless mode.
|
||||
|
||||
6. Added the functionality to download images.
|
||||
|
||||
7. Added OCR recognition of elements. To use this feature, the Tesseract library must be installed first: https://tesseract-ocr.github.io/tessdoc/Installation.html
|
||||
|
||||
8. Directly extract the return value of executing JavaScript code on elements, allowing for functionalities such as regular expression matching and obtaining the background color of elements.
|
||||
|
||||
9. Added the capability to switch dropdown options and extract the selected value and text of dropdown options.
|
||||
|
||||
10. Significantly improved user guidance and explanations to make the software more user-friendly. This includes instructions on handling iframe tags, explanations of what each option's parameters mean, and guidance on modifying the XPath for loop items.
|
||||
|
||||
11. Added instructions on how to execute tasks from the command line.
|
||||
|
||||
12. Added headless mode configuration, allowing the software to run without a browser interface.
|
||||
|
||||
13. Fixed the issue where Chinese paths couldn't be recognized correctly when using user-configured browser modes.
|
||||
|
||||
14. Fixed the issue where the program would freeze when there was no unconditional branch in the conditional branching.
|
||||
|
||||
15. Fixed the issue where the input box would freeze after saving a task.
|
||||
|
||||
16. Added the option to set the maximum waiting time for page load in the "Open Page" and "Click element" operations.
|
||||
|
||||
17. Added the functionality to move the mouse to an element.
|
||||
|
||||
18. Displays a prompt when an element cannot be found.
|
||||
|
||||
19. Fixed the webpage scrolling bug.
|
||||
|
||||
20. The task name is initialized with the value of the page title upon the first visit.
|
||||
|
||||
21. Added version update prompts.
|
||||
|
||||
22. Added the information of the publisher as requested.
|
||||
|
||||
23. Updated Chrome version to 113.
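Item 1 in the list above mentions invoking scripts at the operating-system level and recording their return value, and item 2 uses such a value as a condition. The sketch below shows one way this could look with the standard library only. The function names and the truthiness convention are assumptions made for illustration, not EasySpider's actual behavior.

```python
import subprocess
import sys


def run_system_command(args, timeout=30):
    """Run a command and return (exit_code, stripped_stdout)."""
    completed = subprocess.run(
        args, capture_output=True, text=True, timeout=timeout
    )
    return completed.returncode, completed.stdout.strip()


def output_is_truthy(output):
    """Assumed convention: empty output, "0" and "false" count as false."""
    return output.strip().lower() not in ("", "0", "false")


# Usage: the Python interpreter stands in for an arbitrary system script.
code, output = run_system_command([sys.executable, "-c", "print(6 * 7)"])
print(code, output, output_is_truthy(output))
```

Passing the command as a list rather than a single shell string avoids quoting problems when arguments contain spaces.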
|
||||
Tasks can be imported from other machines by simply placing the .json files from the "tasks" folder of those machines into the "tasks" folder of this directory. Similarly, execution instance files can be imported by copying the .json files from the "execution_instances" folder. Note that only files named with a number greater than 0 are supported in both folders.
|
File diff suppressed because one or more lines are too long
@ -1 +1 @@
|
||||
{"id":228,"name":"[2312.02977] Exploring the nonclassical dynamics of the \"classical'' Schrödinger equation","url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","create_time":"12/7/2023, 2:44:58 AM","update_time":"12/7/2023, 2:56:47 AM","version":"0.6.0","saveThreshold":10,"quitWaitTime":60,"environment":1,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"desc":"https://arxiv.org/abs/2312.02977","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://arxiv.org/abs/2312.02977","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://arxiv.org/abs/2312.02977"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, 
\"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}}]}
|
||||
{"id":228,"name":"[2312.02977] Exploring the nonclassical dynamics of the \"classical'' Schrödinger equation","url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","create_time":"12/7/2023, 2:44:58 AM","update_time":"2024-01-05 22:08:46","version":"0.6.0","saveThreshold":10,"quitWaitTime":3,"environment":1,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"TTT","dataWriteMode":3,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://arxiv.org/abs/2312.02977","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://arxiv.org/abs/2312.02977","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://arxiv.org/abs/2312.02977"},{"id":1,"name":"loopTimes_1","nodeId":5,"nodeName":"循环 - 单个元素","desc":"循环循环 - 单个元素执行的次数(0代表无限循环)","type":"int","exampleValue":10,"value":10}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,5],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":3,"index":2,"parentId":2,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, 
\"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":-1,"index":3,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":2,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, \"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":-1,"index":4,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":3,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, \"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button 
download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":2,"index":5,"parentId":0,"type":1,"option":8,"title":"循环 - 单个元素","sequence":[2],"isInLoop":false,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"//body","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":0,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":10,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0}}]}
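The task files above store the workflow as a flat `graph` list in which each node has a `sequence` of children. Judging only from these sample files, the entries in `sequence` appear to be positions in the `graph` list (the node's `index`), not node `id` values; the sketch below relies on that assumption. `walk_task` is a hypothetical helper, and the inline task is a reduced stand-in for a real file.

```python
import json


def walk_task(task, index=0, depth=0):
    """Yield (depth, title) for each node reachable from the root, in order."""
    node = task["graph"][index]
    yield depth, node["title"]
    # Assumption: each entry of "sequence" is a position in task["graph"].
    for child_index in node.get("sequence", []):
        yield from walk_task(task, child_index, depth + 1)


example = json.loads(
    '{"graph": ['
    '{"index": 0, "title": "root", "sequence": [1, 3]},'
    '{"index": 1, "title": "open page", "sequence": []},'
    '{"index": 2, "title": "click", "sequence": []},'
    '{"index": 3, "title": "loop", "sequence": [2]}]}'
)
for depth, title in walk_task(example):
    print("  " * depth + title)
```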
|
@ -1 +1 @@
|
||||
{"id":229,"name":"知乎 - 有问题,就会有答案","url":"https://www.zhihu.com","links":"https://www.zhihu.com","create_time":"07/12/2023, 03:26:24","update_time":"07/12/2023, 03:43:34","version":"0.6.0","saveThreshold":10,"quitWaitTime":6,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"pauseKey":"t","containJudge":false,"desc":"https://www.zhihu.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.zhihu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.zhihu.com"}],"outputParameters":[{"id":0,"name":"参数1_文本","desc":"","type":"text","recordASField":1,"exampleValue":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.zhihu.com","links":"https://www.zhihu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","wai
tElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":["/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","//div[contains(., '死刑执行前可以谎称肚')]","/html/body/div[last()-7]/div/main/div/div/div[last()-1]/div/div/div/div/div/div[last()-12]/div/div/div/div/h2/div"]}},{"id":3,"index":3,"parentId":2,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":0,"contentType":0,"relative":true,"name":"参数1_文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"unique_index":"onlvi030w9jlpu5tjzb","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0}],"loopType":1}}]}
|
||||
{"id":229,"name":"知乎 - 有问题,就会有答案","url":"https://www.zhihu.com","links":"https://www.zhihu.com","create_time":"07/12/2023, 03:26:24","update_time":"2023-12-27 20:05:50","version":"0.6.0","saveThreshold":10,"quitWaitTime":6,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"t","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"知了个乎","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.zhihu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.zhihu.com"},{"id":1,"name":"loopTimes_1","nodeId":4,"nodeName":"循环 - 单个元素","desc":"循环循环 - 单个元素执行的次数(0代表无限循环)","type":"int","exampleValue":0,"value":0}],"outputParameters":[{"id":0,"name":"参数1_文本","desc":"","type":"text","recordASField":1,"exampleValue":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,4,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.zhihu.com","links":"https://www.zhihu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":3,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":2,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]
/div[1]/div/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":["/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","//div[contains(., '死刑执行前可以谎称肚')]","/html/body/div[last()-7]/div/main/div/div/div[last()-1]/div/div/div/div/div/div[last()-12]/div/div/div/div/h2/div"]}},{"id":4,"index":3,"parentId":3,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":0,"contentType":0,"relative":true,"name":"参数1_文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"unique_index":"onlvi030w9jlpu5tjzb","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0}],"loopType":1}},{"id":2,"index":4,"parentId":0,"type":1,"option":8,"title":"循环 - 
单个元素","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":0,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0}}]}
|
@ -1 +1 @@
|
||||
{"id":70,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"5/24/2023, 8:21:45 PM","version":"0.3.1","containJudge":false,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"string","exampleValue":"https://www.jd.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","wait":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":""}},{"id":3,"index":3,"parentId":2,"type":0,"option":7,"title":"移动到元素","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":2,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"allXPaths":"","loopType":1}}]}
|
||||
{"id":-2,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"5/24/2023, 8:21:45 PM","version":"0.3.1","containJudge":false,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"string","exampleValue":"https://www.jd.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","wait":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":""}},{"id":3,"index":3,"parentId":2,"type":0,"option":7,"title":"移动到元素","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":2,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"allXPaths":"","loopType":1}}]}
|
File diff suppressed because one or more lines are too long
@ -1,102 +1,15 @@
|
||||
You are welcome to share this software with anyone who might need it!
|
||||
You are welcome to share this software with anyone who might need it and to star our GitHub repository!
|
||||
|
||||
Official site: https://www.easyspider.cn
|
||||
|
||||
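Supports Windows 7 and above, on both 32-bit and 64-bit systems. Note that the Chrome browser bundled with this version is fixed at version 109 and will not update along with Chrome (for compatibility with Windows 7). To collect data with the latest Chrome, run the x64 version of EasySpider on Windows 10 x64 or later.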
|
||||
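Supports Windows 7 and above, on both 32-bit and 64-bit systems. Note that the Chrome browser bundled with this version is fixed at version 109 and will not update along with Chrome (for compatibility with Windows 7). To collect data with the latest Chrome, run the x64 version of EasySpider on Windows 10 x64 or later. No release supports Windows Server 2012 or earlier; on those systems you need to compile and run the software yourself.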
|
||||
|
||||
Open-source code repository on GitHub: https://github.com/NaiboWang/EasySpider
|
||||
|
||||
Official documentation: https://github.com/NaiboWang/EasySpider/wiki
|
||||
|
||||
Video tutorial: https://www.bilibili.com/video/BV1th411A7ey/
|
||||
|
||||
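This software is definitely not a trojan or virus! If antivirus software such as Windows Defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software.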
|
||||
|
||||
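Tasks can be imported from other machines: simply place the .json files from the other machine's tasks folder into the tasks folder in this directory. Likewise, execution instance files can be imported by copying the .json files in the execution_instances folder. Note that the .json files in both folders must be named with a number greater than 0.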
|
||||
|
||||
|
||||
======Version Update Notes======
|
||||
|
||||
For update notes on versions later than v0.3.2, see the GitHub Releases page: https://github.com/NaiboWang/EasySpider/releases
|
||||
|
||||
-----v0.3.2-----
|
||||
|
||||
## Update Notes
|
||||
|
||||
1. The select-child-elements operation can delete fields, and deleted fields are unmarked in the browser in real time.
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/e016c832-6ff9-4814-b86c-38787e73aa30" width=50% />
|
||||
|
||||
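2. Selecting child elements adds selection modes: select only the child elements present in all blocks, or only the child elements in all blocks that match those of the first selected block.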
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/0082b11d-96bc-43f1-acdb-8280decb48b4" width=50% />
|
||||
|
||||
3. In the Input Text and Open Page options, the most recently extracted field value can be used **as a variable** for text input, written as `Field["field_name"]`.
|
||||

|
||||
|
||||
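4. Files such as PDFs can be downloaded.
|
||||
5. Fixed a bug where a blank screen could appear for about 10 seconds after launch, so the software can now be used on intranets, the dark web, and any local network.
|
||||
6. Fixed a bug where the current page URL and title could sometimes not be extracted.
|
||||
7. Fixed a bug where OCR recognition could sometimes fail to extract content.
|
||||
8. Extraction logic updated to save locally after every 10 records collected.
|
||||
9. When modifying a task, the default anchor position is after the last operation in the task flow.
|
||||
10. Updated Chrome to version 114.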
|
||||
|
||||
|
||||
-----v0.3.1-----
|
||||
|
||||
### We strongly recommend watching the new-feature walkthrough videos
|
||||
|
||||
Videos covering the latest version's features have been uploaded to Bilibili. They are very useful and well worth watching.
|
||||
|
||||
[Important: custom condition checks using the return value of a JS command inside a loop item (Part 2)](https://www.bilibili.com/video/BV1mu411x7Nn/)
|
||||
|
||||
[How to run multiple tasks at the same time (parallel instances)](https://www.bilibili.com/video/BV13c411G7LE/)
|
||||
|
||||
[How to run your own JS code and system code (custom operations)](https://www.bilibili.com/video/BV1qs4y1z7Hc/)
|
||||
|
||||
[How to customize loop and judgment conditions (Part 1)](https://www.bilibili.com/video/BV1Ys4y1z777/)
|
||||
|
||||
[How to take screenshots of elements and pages, plus a guide to command-line execution (headless mode)](https://www.bilibili.com/video/BV1dV4y1z764/)
|
||||
|
||||
[OCR recognition of element content](https://www.bilibili.com/video/BV1xz4y1b72D/)
|
||||
|
||||
Note: the `.json` files in the task folder for v0.3.1 are incompatible with all earlier versions; please redesign your tasks for v0.3.1.
|
||||
|
||||
## Update Notes
|
||||
1. Custom operations:
|
||||
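- **Custom scripts can be executed** in the task flow, including **executing JavaScript commands** in the browser and **invoking scripts at the operating-system level**; the **command's return value can be captured and recorded**, which greatly expands what tasks can do.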
|
||||
|
||||

|
||||
|
||||
- Before and after each operation, you can specify a JavaScript command to run against the currently located element.
|
||||
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/dde64388-5668-40ff-951e-fb8f60655c49" height=50% width=50%>
|
||||
|
||||
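2. **Conditions and loop conditions** also support **executing custom scripts**: whether the custom script's return value is true determines the outcome of the condition or loop check, which likewise greatly increases task flexibility. Loops gain a setting for breaking out via code, and custom operations can act on elements inside the loop.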
|
||||

|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/5ce7cf50-e5c9-4714-a83b-9c65934e9c68" width=50%></img>
|
||||
|
||||
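3. Multiple XPath expressions can be generated at once for the user to choose from, and the **XPath Helper extension is pre-installed** for debugging XPath.
|
||||
4. Added extraction of an element's background image URL, the current page title, and the current page URL.
|
||||
5. Added saving of element screenshots; use this to capture a specific element or the entire page (works best together with headless mode).
|
||||
6. Added image downloading.
|
||||
7. Added OCR recognition of elements (the Tesseract library must be installed first: [https://blog.csdn.net/u010454030/article/details/80515501](https://blog.csdn.net/u010454030/article/details/80515501))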
|
||||
|
||||
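8. The return value of JavaScript code executed on an element can be extracted directly, enabling things like regular-expression matching or reading an element's background color.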
|
||||
9. Added switching of dropdown options, and extraction of the currently selected option's value and text.
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/c0b2bec1-2a97-4516-930e-1b310697212b" width=50%></img>
|
||||
|
||||

|
||||
|
||||
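10. Greatly expanded usage hints and explanations to make the software easier to use (for example, how to handle iframe tags, the meaning of each option's parameters, and how to modify a loop item's XPath).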
|
||||
11. When running a task, a hint on how to execute tasks from the command line is now shown: [https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction](https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction).
|
||||

|
||||
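12. Added parallel multi-instance mode.
|
||||
13. Added headless mode, i.e. a configuration with no browser window.
|
||||
14. Fixed Chinese paths not being recognized correctly in user-configured browser mode.
|
||||
15. Fixed a freeze that occurred when a conditional branch had no unconditional branch.
|
||||
16. Fixed the issue where the input box would freeze after saving a task.
|
||||
17. Added a setting for the maximum page-load waiting time to the "Open Page" and "Click Element" operations.
|
||||
18. Added the ability to move the mouse to an element.
|
||||
19. A prompt is now shown when an element cannot be found.
|
||||
20. Fixed the webpage scrolling bug.
|
||||
21. Added an operation for adding new data-extraction fields.
|
||||
22. The task name is initialized to the title of the first page visited.
|
||||
23. Added version update prompts.
|
||||
24. Added publisher information, as requested.
|
||||
25. Updated Chrome to version 113.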
|
||||
|
@ -5,9 +5,11 @@ import copy
|
||||
import platform
|
||||
import shutil
|
||||
import string
|
||||
import threading
|
||||
# import undetected_chromedriver as uc
|
||||
from utils import detect_optimizable, download_image, extract_text_from_html, get_output_code, isnotnull, lowercase_tags_in_xpath, myMySQL, new_line, \
|
||||
on_press_creator, on_release_creator, readCode, replace_field_values, send_email, split_text_by_lines, write_to_csv, write_to_excel, write_to_json
|
||||
on_press_creator, on_release_creator, readCode, rename_downloaded_file, replace_field_values, send_email, split_text_by_lines, write_to_csv, write_to_excel, write_to_json
|
||||
from constants import WriteMode, DataWriteMode, GraphOption
|
||||
from myChrome import MyChrome
|
||||
from threading import Thread, Event
|
||||
from PIL import Image
|
||||
@ -30,7 +32,6 @@ from selenium.webdriver.common.action_chains import ActionChains
|
||||
from selenium.webdriver.common.keys import Keys
|
||||
from selenium.webdriver.chrome.options import Options
|
||||
from selenium.webdriver.chrome.service import Service
|
||||
from pynput.keyboard import Key, Listener
|
||||
from datetime import datetime
|
||||
import io # code that should run when exiting on an error
|
||||
import json
|
||||
@ -75,10 +76,7 @@ class BrowserThread(Thread):
|
||||
def __init__(self, browser_t, id, service, version, event, saveName, config, option):
|
||||
Thread.__init__(self)
|
||||
self.logs = io.StringIO()
|
||||
try:
|
||||
self.log = bool(service["recordLog"])
|
||||
except:
|
||||
self.log = True
|
||||
self.log = bool(service.get("recordLog", True))
|
||||
self.browser = browser_t
|
||||
self.option = option
|
||||
self.config = config
|
||||
@ -86,22 +84,13 @@ class BrowserThread(Thread):
|
||||
self.totalSteps = 0
|
||||
self.id = id
|
||||
self.event = event
|
||||
try:
|
||||
self.saveName = service["saveName"] # name of the saved file
|
||||
except:
|
||||
now = datetime.now()
|
||||
# Format the time as a string with one-second precision
|
||||
self.saveName = now.strftime("%Y_%m_%d_%H_%M_%S")
|
||||
now = datetime.now()
|
||||
self.saveName = service.get("saveName", now.strftime("%Y_%m_%d_%H_%M_%S")) # name of the saved file
|
||||
self.OUTPUT = ""
|
||||
self.SAVED = False
|
||||
self.BREAK = False
|
||||
self.CONTINUE = False
|
||||
try:
|
||||
maximizeWindow = service["maximizeWindow"]
|
||||
except:
|
||||
maximizeWindow = 0
|
||||
if maximizeWindow == 1:
|
||||
self.browser.maximize_window()
|
||||
if service.get("maximizeWindow") == 1:
    self.browser.maximize_window()
|
||||
# Name settings
|
||||
if saveName != "": # the command line overrides the save name
|
||||
self.saveName = saveName # name of the saved file
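The hunks above replace several `try`/`except` blocks with `dict.get` defaults. A minimal standalone sketch of that pattern follows; `read_settings` is a hypothetical helper, and the keys simply mirror those visible in the diff. One difference worth keeping in mind is that the default passed to `.get` is always evaluated, whereas the removed bare `except:` also hid errors unrelated to a missing key.

```python
from datetime import datetime


def read_settings(service):
    """Read optional task settings, falling back to defaults for absent keys."""
    now = datetime.now()
    return {
        # A missing "recordLog" key means logging stays enabled.
        "log": bool(service.get("recordLog", True)),
        # A missing "saveName" key falls back to a timestamp.
        "saveName": service.get("saveName", now.strftime("%Y_%m_%d_%H_%M_%S")),
        # A missing "maximizeWindow" key means the window is not maximized.
        "maximizeWindow": service.get("maximizeWindow", 0) == 1,
    }


print(read_settings({"recordLog": 0, "saveName": "TTT"}))
```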
|
||||
@ -112,19 +101,23 @@ class BrowserThread(Thread):
|
||||
self.print_and_log("Save Name for task ID", id, "is:", self.saveName)
|
||||
if not os.path.exists("Data/Task_" + str(id)):
|
||||
os.mkdir("Data/Task_" + str(id))
|
||||
if not os.path.exists("Data/Task_" + str(id) + "/" + self.saveName):
|
||||
os.mkdir("Data/Task_" + str(id) + "/" +
|
||||
self.saveName) # 创建保存文件夹用来保存截图
|
||||
self.downloadFolder = "Data/Task_" + str(id) + "/" + self.saveName
|
||||
if not os.path.exists(self.downloadFolder):
|
||||
os.mkdir(self.downloadFolder) # Create the output folder for screenshots and files
|
||||
if not os.path.exists(self.downloadFolder + "/files"):
|
||||
os.mkdir(self.downloadFolder + "/files")
|
||||
if not os.path.exists(self.downloadFolder + "/images"):
|
||||
os.mkdir(self.downloadFolder + "/images")
|
||||
self.getDataStep = 0
|
||||
self.startSteps = 0
|
||||
try:
|
||||
startFromExit = service["startFromExit"] # 从上次退出的步骤开始
|
||||
if startFromExit == 1:
|
||||
if service.get("startFromExit", 0) == 1:
|
||||
with open("Data/Task_" + str(self.id) + "/" + self.saveName + '_steps.txt', 'r',
|
||||
encoding='utf-8-sig') as file_obj:
|
||||
self.startSteps = int(file_obj.read()) # Read the number of steps already executed
|
||||
except:
|
||||
pass
|
||||
except Exception as e:
|
||||
self.print_and_log(f"读取steps.txt失败,原因:{str(e)}")
|
||||
|
||||
if self.startSteps != 0:
|
||||
self.print_and_log("此模式下,任务ID", self.id, "将从上次退出的步骤开始执行,之前已采集条数为",
|
||||
self.startSteps, "条。")
|
||||
@ -132,7 +125,7 @@ class BrowserThread(Thread):
|
||||
"will start from the last step, before we already collected", self.startSteps, " items.")
|
||||
else:
|
||||
self.print_and_log("此模式下,任务ID", self.id,
|
||||
"将从头F开始执行,如果需要从上次退出的步骤开始执行,请在保存任务时设置是否从上次保存位置开始执行为“是”。")
|
||||
"将从头开始执行,如果需要从上次退出的步骤开始执行,请在保存任务时设置是否从上次保存位置开始执行为“是”。")
|
||||
self.print_and_log("In this mode, task ID", self.id,
|
||||
"will start from the beginning, if you want to start from the last step, please set the option 'start from the last step' to 'yes' when saving the task.")
|
||||
stealth_path = driver_path[:driver_path.find(
|
||||
@ -140,78 +133,83 @@ class BrowserThread(Thread):
|
||||
with open(stealth_path, 'r') as f:
|
||||
js = f.read()
|
||||
self.print_and_log("Loading stealth.min.js")
|
||||
self.browser.execute_cdp_cmd('Page.addScriptToEvaluateOnNewDocument', {
|
||||
'source': js}) # TMALL 反扒
|
||||
self.browser.execute_cdp_cmd('Page.addScriptToEvaluateOnNewDocument', {'source': js}) # TMALL anti-scraping workaround
|
||||
self.browser.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
|
||||
"source": """
|
||||
Object.defineProperty(navigator, 'webdriver', {
|
||||
get: () => undefined
|
||||
})
|
||||
"""
|
||||
})
|
||||
WebDriverWait(self.browser, 10)
|
||||
self.browser.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
|
||||
path = os.path.join(os.path.abspath("./"), "Data", "Task_" + str(self.id))
|
||||
path = os.path.join(os.path.abspath("./"), "Data", "Task_" + str(self.id), self.saveName, "files")
|
||||
self.paramss = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': path}}
|
||||
|
||||
self.browser.execute("send_command", self.paramss) # 下载地址改变
|
||||
self.browser.execute("send_command", self.paramss) # Change the download directory
|
||||
self.monitor_event = threading.Event()
|
||||
self.monitor_thread = threading.Thread(target=rename_downloaded_file, args=(path, self.monitor_event)) # args must be passed as a tuple
|
||||
self.monitor_thread.start()
|
||||
# self.browser.get('about:blank')
|
||||
self.procedure = service["graph"] # Task execution flow
|
||||
try:
|
||||
self.maxViewLength = service["maxViewLength"] # 最大显示长度
|
||||
except:
|
||||
self.maxViewLength = 15
|
||||
try:
|
||||
self.outputFormat = service["outputFormat"] # 输出格式
|
||||
except:
|
||||
self.outputFormat = "csv"
|
||||
try:
|
||||
self.task_version = service["version"] # 任务版本
|
||||
if service["version"] >= "0.3.1": # 0.3.1及以上版本以上的EasySpider兼容从0.3.1版本开始的所有版本
|
||||
pass
|
||||
else: # 0.3.1以下版本的EasySpider不兼容0.3.1及以上版本的EasySpider
|
||||
if service["version"] != version:
|
||||
self.print_and_log("版本不一致,请使用" +
|
||||
service["version"] + "版本的EasySpider运行该任务!")
|
||||
self.print_and_log("Version not match, please use EasySpider " +
|
||||
service["version"] + " to run this task!")
|
||||
self.browser.quit()
|
||||
sys.exit()
|
||||
except: # 0.2.0版本没有version字段,所以直接退出
|
||||
self.maxViewLength = service.get("maxViewLength", 15) # Maximum display length
|
||||
self.outputFormat = service.get("outputFormat", "csv") # Output format
|
||||
self.save_threshold = service.get("saveThreshold", 10) # Minimum number of rows before saving
|
||||
self.dataWriteMode = service.get("dataWriteMode", DataWriteMode.Append.value) # Data write mode: 1 = append, 2 = overwrite, 3 = rename the file
|
||||
self.task_version = service.get("version", "") # Task version
|
||||
|
||||
if not self.task_version:
|
||||
self.print_and_log("版本不一致,请使用v0.2.0版本的EasySpider运行该任务!")
|
||||
self.print_and_log(
|
||||
"Version not match, please use EasySpider v0.2.0 to run this task!")
|
||||
self.print_and_log("Version not match, please use EasySpider v0.2.0 to run this task!")
|
||||
self.browser.quit()
|
||||
sys.exit()
|
||||
try:
|
||||
self.save_threshold = service["saveThreshold"] # 保存最低阈值
|
||||
except:
|
||||
self.save_threshold = 10
|
||||
try:
|
||||
self.links = list(
|
||||
filter(isnotnull, service["links"].split("\n"))) # 要执行的link的列表
|
||||
except:
|
||||
|
||||
if self.task_version >= "0.3.1": # EasySpider 0.3.1 and later is compatible with every task version from 0.3.1 onward
|
||||
pass
|
||||
elif self.task_version != version: # EasySpider below 0.3.1 is not compatible with 0.3.1 and later
|
||||
self.print_and_log(f"版本不一致,请使用{self.task_version}版本的EasySpider运行该任务!")
|
||||
self.print_and_log(f"Version not match, please use EasySpider {self.task_version} to run this task!")
|
||||
self.browser.quit()
|
||||
sys.exit()
|
||||
|
||||
service_links = service.get("links")
|
||||
if service_links:
|
||||
self.links = list(filter(isnotnull, service_links.split("\n"))) # List of links to execute
|
||||
else:
|
||||
self.links = list(filter(isnotnull, service["url"])) # 要执行的link
|
||||
|
||||
self.OUTPUT = [] # Collected data
|
||||
try:
|
||||
self.dataWriteMode = service["dataWriteMode"] # 数据写入模式,1为追加,2为覆盖
|
||||
except:
|
||||
self.dataWriteMode = 1
|
||||
if self.outputFormat == "csv" or self.outputFormat == "txt" or self.outputFormat == "xlsx" or self.outputFormat == "json":
|
||||
if self.dataWriteMode == 2 and os.path.exists("Data/Task_" + str(self.id) + "/" + self.saveName + '.' + self.outputFormat):
|
||||
os.remove("Data/Task_" + str(self.id) + "/" + self.saveName + '.' + self.outputFormat)
|
||||
self.writeMode = 1 # 写入模式,0为新建,1为追加
|
||||
if self.outputFormat == "csv" or self.outputFormat == "txt" or self.outputFormat == "xlsx":
|
||||
if not os.path.exists("Data/Task_" + str(self.id) + "/" + self.saveName + '.' + self.outputFormat):
|
||||
if self.outputFormat in ["csv", "txt", "xlsx", "json"]:
|
||||
if os.path.exists("Data/Task_" + str(self.id) + "/" + self.saveName + '.' + self.outputFormat):
|
||||
if self.dataWriteMode == DataWriteMode.Cover.value:
|
||||
os.remove("Data/Task_" + str(self.id) + "/" + self.saveName + '.' + self.outputFormat)
|
||||
elif self.dataWriteMode == DataWriteMode.Rename.value:
|
||||
i = 2
|
||||
while os.path.exists("Data/Task_" + str(self.id) + "/" + self.saveName + '_' + str(i) + '.' + self.outputFormat):
|
||||
i = i + 1
|
||||
self.saveName = self.saveName + '_' + str(i)
|
||||
self.print_and_log("文件已存在,已重命名为", self.saveName)
|
||||
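self.writeMode = WriteMode.Append.value # Write mode: 0 = create, 1 = append (the pre-refactor default was 1; Create is set below only when the file does not exist)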
|
||||
if self.outputFormat in ['csv', 'txt', 'xlsx']:
|
||||
if not os.path.exists(f"Data/Task_{str(self.id)}/{self.saveName}.{self.outputFormat}"):
|
||||
self.OUTPUT.append([]) # Add the header row
|
||||
self.writeMode = 0
|
||||
self.writeMode = WriteMode.Create.value
|
||||
elif self.outputFormat == "json":
|
||||
self.writeMode = 3 # JSON模式无需判断是否存在文件
|
||||
self.writeMode = WriteMode.Json.value # JSON mode does not need to check whether the file exists
|
||||
elif self.outputFormat == "mysql":
|
||||
self.mysql = myMySQL(config["mysql_config_path"])
|
||||
self.mysql.create_table(self.saveName, service["outputParameters"], remove_if_exists=self.dataWriteMode == 2)
|
||||
self.writeMode = 2
|
||||
if self.writeMode == 0:
|
||||
self.mysql.create_table(self.saveName, service["outputParameters"],
|
||||
remove_if_exists=self.dataWriteMode == DataWriteMode.Cover.value)
|
||||
self.writeMode = WriteMode.MySQL.value # MySQL mode
|
||||
|
||||
if self.writeMode == WriteMode.Create.value:
|
||||
self.print_and_log("新建模式|Create Mode")
|
||||
elif self.writeMode == 1:
|
||||
elif self.writeMode == WriteMode.Append.value:
|
||||
self.print_and_log("追加模式|Append Mode")
|
||||
elif self.writeMode == 2:
|
||||
elif self.writeMode == WriteMode.MySQL.value:
|
||||
self.print_and_log("MySQL模式|MySQL Mode")
|
||||
elif self.writeMode == 3:
|
||||
elif self.writeMode == WriteMode.Json.value:
|
||||
self.print_and_log("JSON模式|JSON Mode")
|
||||
|
||||
self.containJudge = service["containJudge"] # Whether the task contains conditional statements
|
||||
self.outputParameters = {}
|
||||
self.service = service
|
||||
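The version checks in this class compare version strings directly (`self.task_version >= "0.3.1"`, `self.task_version <= "0.3.5"`). String comparison works character by character, so it agrees with numeric ordering only while every component is a single digit. A minimal sketch of a numeric comparison follows; `parse_version` and `at_least` are hypothetical helpers, and the sketch assumes versions are purely numeric and dot-separated.

```python
def parse_version(version):
    """Turn a dotted version string such as '0.3.10' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def at_least(task_version, minimum):
    """Compare two versions component by component as numbers."""
    return parse_version(task_version) >= parse_version(minimum)


# String comparison: '1' sorts before '3', so "0.10.0" looks older than "0.3.1".
string_result = "0.10.0" >= "0.3.1"
# Tuple comparison: (0, 10, 0) is greater than (0, 3, 1).
numeric_result = at_least("0.10.0", "0.3.1")
```

An empty version string would need to be handled before calling `parse_version`, as the constructor already does with its `if not self.task_version` branch.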
@ -224,191 +222,140 @@ class BrowserThread(Thread):
|
||||
if param["name"] not in self.outputParameters.keys():
|
||||
self.outputParameters[param["name"]] = ""
|
||||
self.dataNotFoundKeys[param["name"]] = False
|
||||
try:
|
||||
self.outputParametersTypes.append(param["type"])
|
||||
except:
|
||||
self.outputParametersTypes.append("text")
|
||||
try:
|
||||
self.outputParametersRecord.append(
|
||||
bool(param["recordASField"]))
|
||||
except:
|
||||
self.outputParametersRecord.append(True)
|
||||
self.outputParametersTypes.append(param.get("type", "text"))
|
||||
self.outputParametersRecord.append(bool(param.get("recordASField", True)))
|
||||
# Do not add a header row when appending to an existing file
|
||||
if self.outputFormat == "csv" or self.outputFormat == "txt" or self.outputFormat == "xlsx":
|
||||
if self.writeMode == 0:
|
||||
self.OUTPUT[0].append(param["name"])
|
||||
if self.outputFormat in ["csv", "txt", "xlsx"] and self.writeMode == WriteMode.Create.value:
|
||||
self.OUTPUT[0].append(param["name"])
|
||||
self.urlId = 0 # Global record counter
|
||||
self.preprocess() # Preprocess to optimize the data extraction flow
|
||||
try:
|
||||
self.inputExcel = service["inputExcel"] # 输入Excel
|
||||
except:
|
||||
self.inputExcel = ""
|
||||
self.inputExcel = service.get("inputExcel", "") # Input Excel file
|
||||
self.readFromExcel() # Read parameter values from Excel
|
||||
|
||||
# If no complex operations are detected, optimize the data extraction flow
|
||||
def preprocess(self):
|
||||
for node in self.procedure:
|
||||
try:
|
||||
iframe = node["parameters"]["iframe"]
|
||||
except:
|
||||
node["parameters"]["iframe"] = False
|
||||
for index_node, node in enumerate(self.procedure):
|
||||
parameters: dict = node["parameters"]
|
||||
iframe = parameters.get('iframe')
|
||||
option = node["option"]
|
||||
|
||||
try:
|
||||
node["parameters"]["xpath"] = lowercase_tags_in_xpath(
|
||||
node["parameters"]["xpath"])
|
||||
except:
|
||||
pass
|
||||
try:
|
||||
node["parameters"]["waitElementIframeIndex"] = int(
|
||||
node["parameters"]["waitElementIframeIndex"])
|
||||
except:
|
||||
node["parameters"]["waitElement"] = ""
|
||||
node["parameters"]["waitElementTime"] = 10
|
||||
node["parameters"]["waitElementIframeIndex"] = 0
|
||||
if node["option"] == 1: # 打开网页操作
|
||||
try:
|
||||
cookies = node["parameters"]["cookies"]
|
||||
except:
|
||||
node["parameters"]["cookies"] = ""
|
||||
elif node["option"] == 2: # 点击操作
|
||||
try:
|
||||
alertHandleType = node["parameters"]["alertHandleType"]
|
||||
except:
|
||||
node["parameters"]["alertHandleType"] = 0
|
||||
if node["parameters"]["useLoop"]:
|
||||
parameters["iframe"] = iframe or False
|
||||
if parameters.get("xpath"):
|
||||
parameters["xpath"] = lowercase_tags_in_xpath(parameters["xpath"])
|
||||
|
||||
if parameters.get("waitElementIframeIndex") not in (None, ""): # An index of 0 is valid and must not reset waitElement
|
||||
parameters["waitElementIframeIndex"] = int(parameters["waitElementIframeIndex"])
|
||||
else:
|
||||
parameters["waitElement"] = ""
|
||||
parameters["waitElementTime"] = 10
|
||||
parameters["waitElementIframeIndex"] = 0
|
||||
|
||||
if option == GraphOption.Get.value: # Open web page
|
||||
parameters["cookies"] = parameters.get("cookies", "")
|
||||
elif option == GraphOption.Click.value: # Click
|
||||
parameters["alertHandleType"] = parameters.get("alertHandleType", 0)
|
||||
if parameters.get("useLoop"):
|
||||
if self.task_version <= "0.3.5":
|
||||
# 0.3.5及以下版本的EasySpider下的循环点击不支持相对XPath
|
||||
node["parameters"]["xpath"] = ""
|
||||
self.print_and_log("您的任务版本号为" + self.task_version +
|
||||
",循环点击不支持相对XPath写法,已自动切换为纯循环的XPath")
|
||||
elif node["option"] == 3: # 提取数据操作
|
||||
node["parameters"]["recordASField"] = 0
|
||||
try:
|
||||
params = node["parameters"]["params"]
|
||||
except:
|
||||
node["parameters"]["params"] = node["parameters"]["paras"] # 兼容0.5.0及以下版本的EasySpider
|
||||
params = node["parameters"]["params"]
|
||||
try:
|
||||
clear = node["parameters"]["clear"]
|
||||
except:
|
||||
node["parameters"]["clear"] = 0
|
||||
try:
|
||||
newLine = node["parameters"]["newLine"]
|
||||
except:
|
||||
node["parameters"]["newLine"] = 1
|
||||
parameters["xpath"] = ""
|
||||
self.print_and_log(f"您的任务版本号为{self.task_version},循环点击不支持相对XPath写法,已自动切换为纯循环的XPath")
|
||||
elif option == GraphOption.Extract.value: # Extract data
|
||||
parameters["recordASField"] = 0
|
||||
parameters["params"] = parameters.get("params", parameters.get("paras")) # Compatibility with EasySpider 0.5.0 and earlier
|
||||
parameters["clear"] = parameters.get("clear", 0)
|
||||
parameters["newLine"] = parameters.get("newLine", 1)
|
||||
|
||||
params = parameters["params"]
|
||||
for param in params:
|
||||
try:
|
||||
iframe = param["iframe"]
|
||||
except:
|
||||
param["iframe"] = False
|
||||
try:
|
||||
param["iframe"] = param.get("iframe", False)
|
||||
|
||||
if param.get("relativeXPath"):
|
||||
param["relativeXPath"] = lowercase_tags_in_xpath(param["relativeXPath"])
|
||||
except:
|
||||
pass
|
||||
try:
|
||||
node["parameters"]["recordASField"] = param["recordASField"]
|
||||
except:
|
||||
node["parameters"]["recordASField"] = 1
|
||||
try:
|
||||
splitLine = int(param["splitLine"])
|
||||
except:
|
||||
param["splitLine"] = 0
|
||||
if param["contentType"] == 8:
|
||||
self.print_and_log(
|
||||
"默认的ddddocr识别功能如果觉得不好用,可以自行修改源码get_content函数->contentType == 8的位置换成自己想要的OCR模型然后自己编译运行;或者可以先设置采集内容类型为“元素截图”把图片保存下来,然后用自定义操作调用自己写的程序,程序的功能是读取这个最新生成的图片,然后用好用的模型,如PaddleOCR把图片识别出来,然后把返回值返回给程序作为参数输出。")
|
||||
self.print_and_log(
|
||||
"If you think the default ddddocr function is not good enough, you can modify the source code get_content function -> contentType == 8 position to your own OCR model and then compile and run it; or you can first set the content type of the crawler to \"Element Screenshot\" to save the picture, and then call your own program with custom operations. The function of the program is to read the latest generated picture, then use a good model, such as PaddleOCR to recognize the picture, and then return the return value as a parameter output to the program.")
|
||||
|
||||
parameters["recordASField"] = param.get("recordASField", 1)
|
||||
|
||||
param["splitLine"] = param.get("splitLine") or 0
|
||||
|
||||
if param.get("contentType") == 8:
|
||||
self.print_and_log("默认的ddddocr识别功能如果觉得不好用,可以自行修改源码get_content函数->contentType =="
|
||||
"8的位置换成自己想要的OCR模型然后自己编译运行;或者可以先设置采集内容类型为“元素截图”把图片"
|
||||
"保存下来,然后用自定义操作调用自己写的程序,程序的功能是读取这个最新生成的图片,然后用好用"
|
||||
"的模型,如PaddleOCR把图片识别出来,然后把返回值返回给程序作为参数输出。")
|
||||
self.print_and_log("If you think the default ddddocr function is not good enough, you can "
|
||||
"modify the source code get_content function -> contentType == 8 position "
|
||||
"to your own OCR model and then compile and run it; or you can first set "
|
||||
"the content type of the crawler to \"Element Screenshot\" to save the "
|
||||
"picture, and then call your own program with custom operations. The "
|
||||
"function of the program is to read the latest generated picture, then use "
|
||||
"a good model, such as PaddleOCR to recognize the picture, and then return "
|
||||
"the return value as a parameter output to the program.")
|
||||
param["optimizable"] = detect_optimizable(param)
|
||||
elif node["option"] == 4: # 输入文字
|
||||
try:
|
||||
index = node["parameters"]["index"] # 索引值
|
||||
except:
|
||||
node["parameters"]["index"] = 0
|
||||
elif node["option"] == 5: # 自定义操作
|
||||
try:
|
||||
clear = node["parameters"]["clear"]
|
||||
except:
|
||||
node["parameters"]["clear"] = 0
|
||||
try:
|
||||
newLine = node["parameters"]["newLine"]
|
||||
except:
|
||||
node["parameters"]["newLine"] = 1
|
||||
elif node["option"] == 7: # 移动到元素
|
||||
if node["parameters"]["useLoop"]:
|
||||
if self.task_version <= "0.3.5":
|
||||
# 0.3.5及以下版本的EasySpider下的循环点击不支持相对XPath
|
||||
node["parameters"]["xpath"] = ""
|
||||
self.print_and_log("您的任务版本号为" + self.task_version +
|
||||
",循环点击不支持相对XPath写法,已自动切换为纯循环的XPath")
|
||||
elif node["option"] == 8: # 循环操作
|
||||
try:
|
||||
exitElement = node["parameters"]["exitElement"]
|
||||
if exitElement == "":
|
||||
node["parameters"]["exitElement"] = "//body"
|
||||
except:
|
||||
node["parameters"]["exitElement"] = "//body"
|
||||
node["parameters"]["quickExtractable"] = False # 是否可以快速提取
|
||||
try:
|
||||
skipCount = node["parameters"]["skipCount"]
|
||||
except:
|
||||
node["parameters"]["skipCount"] = 0
|
||||
elif option == GraphOption.Input.value: # Input text
|
||||
parameters['index'] = parameters.get('index', 0)
|
||||
elif option == GraphOption.Custom.value: # Custom operation
|
||||
parameters['clear'] = parameters.get('clear', 0)
|
||||
parameters['newLine'] = parameters.get('newLine', 1)
|
||||
elif option == GraphOption.Move.value: # Move to element
|
||||
if parameters.get('useLoop'):
|
||||
if self.task_version <= "0.3.5": # Loop clicks in EasySpider 0.3.5 and earlier do not support relative XPath
|
||||
parameters["xpath"] = ""
|
||||
self.print_and_log(f"您的任务版本号为{self.task_version},循环点击不支持相对XPath写法,已自动切换为纯循环的XPath")
|
||||
elif option == GraphOption.Loop.value: # Loop
|
||||
parameters['exitElement'] = parameters.get('exitElement') or "//body"
|
||||
parameters["quickExtractable"] = False # Whether quick extraction is possible
|
||||
parameters['skipCount'] = parameters.get('skipCount', 0)
|
||||
|
||||
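# If a (non-)fixed element list loop contains exactly one extract-data operation, and that operation extracts element screenshots, quick extraction is possible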
|
||||
if len(node["sequence"]) == 1 and self.procedure[node["sequence"][0]]["option"] == 3 and (int(node["parameters"]["loopType"]) == 1 or int(node["parameters"]["loopType"]) == 2):
|
||||
try:
|
||||
params = self.procedure[node["sequence"][0]]["parameters"]["params"]
|
||||
except:
|
||||
params = self.procedure[node["sequence"][0]]["parameters"]["paras"] # 兼容0.5.0及以下版本的EasySpider
|
||||
try:
|
||||
waitElement = self.procedure[node["sequence"][0]]["parameters"]["waitElement"]
|
||||
except:
|
||||
waitElement = ""
|
||||
if node["parameters"]["iframe"]:
|
||||
node["parameters"]["quickExtractable"] = False # 如果是iframe,那么不可以快速提取
|
||||
if len(node["sequence"]) == 1 and self.procedure[node["sequence"][0]]["option"] == 3 \
|
||||
and (int(node["parameters"]["loopType"]) == 1 or int(node["parameters"]["loopType"]) == 2):
|
||||
params = self.procedure[node["sequence"][0]].get("parameters").get("params")
|
||||
if not params:
|
||||
params = self.procedure[node["sequence"][0]]["parameters"]["paras"] # Compatibility with EasySpider 0.5.0 and earlier
|
||||
|
||||
waitElement = self.procedure[node["sequence"][0]]["parameters"].get("waitElement", "")
|
||||
|
||||
if parameters["iframe"]:
|
||||
parameters["quickExtractable"] = False # Quick extraction is not possible inside an iframe
|
||||
else:
|
||||
node["parameters"]["quickExtractable"] = True # 先假设可以快速提取
|
||||
if node["parameters"]["skipCount"] > 0:
|
||||
node["parameters"]["quickExtractable"] = False # 如果有跳过的元素,那么不可以快速提取
|
||||
parameters["quickExtractable"] = True # Assume quick extraction is possible for now
|
||||
|
||||
if parameters["skipCount"] > 0:
|
||||
parameters["quickExtractable"] = False # Quick extraction is not possible when elements are skipped
|
||||
|
||||
for param in params:
|
||||
optimizable = detect_optimizable(param, ignoreWaitElement=False, waitElement=waitElement)
|
||||
try:
|
||||
iframe = param["iframe"]
|
||||
except:
|
||||
param["iframe"] = False
|
||||
if param["iframe"] and not param["relative"]: # 如果是iframe,那么不可以快速提取
|
||||
param['iframe'] = param.get('iframe', False)
|
||||
if param["iframe"] and not param["relative"]: # Quick extraction is not possible inside an iframe
|
||||
optimizable = False
|
||||
if not optimizable: # 如果有一个不满足优化条件,那么就不能快速提取
|
||||
node["parameters"]["quickExtractable"] = False
|
||||
if not optimizable: # If any parameter fails the optimization conditions, quick extraction is not possible
|
||||
parameters["quickExtractable"] = False
|
||||
break
|
||||
if node["parameters"]["quickExtractable"]:
|
||||
self.print_and_log("循环操作<" + node["title"] + ">可以快速提取数据")
|
||||
self.print_and_log("Loop operation <" + node["title"] + "> can extract data quickly")
|
||||
try:
|
||||
node["parameters"]["clear"] = self.procedure[node["sequence"][0]]["parameters"]["clear"]
|
||||
except:
|
||||
node["parameters"]["clear"] = 0
|
||||
try:
|
||||
node["parameters"]["newLine"] = self.procedure[node["sequence"][0]]["parameters"]["newLine"]
|
||||
except:
|
||||
node["parameters"]["newLine"] = 1
|
||||
if int(node["parameters"]["loopType"]) == 1: # 不固定元素列表
|
||||
|
||||
if parameters["quickExtractable"]:
|
||||
self.print_and_log(f"循环操作<{node['title']}>可以快速提取数据")
|
||||
self.print_and_log(f"Loop operation <{node['title']}> can extract data quickly")
|
||||
parameters["clear"] = self.procedure[node["sequence"][0]]["parameters"].get("clear", 0)
|
||||
parameters["newLine"] = self.procedure[node["sequence"][0]]["parameters"].get("newLine", 1)
|
||||
|
||||
if int(node["parameters"]["loopType"]) == 1: # Non-fixed element list
|
||||
node["parameters"]["baseXPath"] = node["parameters"]["xpath"]
|
||||
elif int(node["parameters"]["loopType"]) == 2: # 固定元素列表
|
||||
elif int(node["parameters"]["loopType"]) == 2: # Fixed element list
|
||||
node["parameters"]["baseXPath"] = node["parameters"]["pathList"]
|
||||
node["parameters"]["quickParams"] = []
|
||||
for param in params:
|
||||
content_type = ""
|
||||
if param["relativeXPath"].find("/@href") >= 0 or param["relativeXPath"].find("/text()") >= 0 or param["relativeXPath"].find(
|
||||
"::text()") >= 0:
|
||||
if param["relativeXPath"].find("/@href") >= 0 or param["relativeXPath"].find("/text()") >= 0 \
|
||||
or param["relativeXPath"].find("::text()") >= 0:
|
||||
content_type = ""
|
||||
elif param["nodeType"] == 2:
|
||||
content_type = "//@href"
|
||||
elif param["nodeType"] == 4: # 图片链接
|
||||
elif param["nodeType"] == 4: # Image link
|
||||
content_type = "//@src"
|
||||
elif param["contentType"] == 1:
|
||||
content_type = "/text()"
|
||||
elif param["contentType"] == 0:
|
||||
content_type = "//text()"
|
||||
if param["relative"]: # 如果是相对XPath
|
||||
if param["relative"]: # Relative XPath
|
||||
xpath = "." + param["relativeXPath"] + content_type
|
||||
else:
|
||||
xpath = param["relativeXPath"] + content_type
|
||||
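The branch chain above decides which XPath suffix is appended for quick extraction. Pulled out into a standalone function it can be sketched as follows; `quick_xpath` is a hypothetical name, and the comments describe the suffixes in XPath terms only, since the diff does not spell out the meaning of each `nodeType` and `contentType` value beyond "image link" for `nodeType == 4`.

```python
def quick_xpath(param):
    """Build the quick-extraction XPath for one parameter dict."""
    rel = param["relativeXPath"]
    if "/@href" in rel or "/text()" in rel or "::text()" in rel:
        suffix = ""          # the XPath already selects an attribute or text node
    elif param["nodeType"] == 2:
        suffix = "//@href"   # href attributes at or below the element
    elif param["nodeType"] == 4:
        suffix = "//@src"    # src attributes at or below the element (image link)
    elif param["contentType"] == 1:
        suffix = "/text()"   # direct text children only
    elif param["contentType"] == 0:
        suffix = "//text()"  # all descendant text nodes
    else:
        suffix = ""          # any other content type: no suffix
    prefix = "." if param["relative"] else ""  # relative XPaths start at the loop element
    return prefix + rel + suffix


link = quick_xpath({"relativeXPath": "/a", "nodeType": 2, "contentType": 0, "relative": True})
text = quick_xpath({"relativeXPath": "//h1", "nodeType": 0, "contentType": 1, "relative": False})
```

Using `in` rather than `str.find(...) >= 0` is equivalent here and reads more directly.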
@ -422,6 +369,7 @@ class BrowserThread(Thread):
|
||||
"nodeType": param["nodeType"],
|
||||
"default": param["default"],
|
||||
})
|
||||
self.procedure[index_node]["parameters"] = parameters
|
||||
self.print_and_log("预处理完成|Preprocess completed")
|
||||
|
||||
def readFromExcel(self):
|
||||
@ -521,7 +469,7 @@ class BrowserThread(Thread):
|
||||
"/", len(self.links))
|
||||
self.executeNode(0)
|
||||
self.urlId = self.urlId + 1
|
||||
files = os.listdir("Data/Task_" + str(self.id) + "/" + self.saveName)
|
||||
# files = os.listdir("Data/Task_" + str(self.id) + "/" + self.saveName)
|
||||
# If the directory is empty, delete it
|
||||
# if not files:
|
||||
# os.rmdir("Data/Task_" + str(self.id) + "/" + self.saveName)
|
||||
@ -538,12 +486,16 @@ class BrowserThread(Thread):
|
||||
self.print_and_log(f"任务执行完毕,将在{quitWaitTime}秒后自动退出浏览器并清理临时用户目录,等待时间可在保存任务对话框中设置。")
|
||||
self.print_and_log(f"The task is completed, the browser will exit automatically and the temporary user directory will be cleaned up after {quitWaitTime} seconds, the waiting time can be set in the save task dialog.")
|
||||
time.sleep(quitWaitTime)
|
||||
self.browser.quit()
|
||||
try:
|
||||
self.browser.quit()
|
||||
except:
|
||||
pass
|
||||
self.print_and_log("正在清理临时用户目录……|Cleaning up temporary user directory...")
|
||||
try:
|
||||
shutil.rmtree(self.option["tmp_user_data_folder"])
|
||||
except:
|
||||
pass
|
||||
self.monitor_event.set()
|
||||
self.print_and_log("清理完成!|Clean up completed!")
|
||||
self.print_and_log("您现在可以安全的关闭此窗口了。|You can safely close this window now.")
|
||||
|
||||
@ -753,28 +705,32 @@ class BrowserThread(Thread):
|
||||
self.browser.set_script_timeout(max_wait_time)
|
||||
try:
|
||||
output = self.browser.execute_script(code)
|
||||
except:
|
||||
except Exception as e:
|
||||
output = ""
|
||||
self.recordLog("JavaScript execution failed")
|
||||
self.print_and_log("执行下面的代码时出错:" + code, ",错误为:", str(e))
|
||||
self.print_and_log("Error executing the following code:" + code, ", error is:", str(e))
|
||||
elif int(codeMode) == 2:
|
||||
self.recordLog("Execute JavaScript for element:" + code)
|
||||
self.recordLog("对元素执行JavaScript:" + code)
|
||||
self.browser.set_script_timeout(max_wait_time)
|
||||
try:
|
||||
output = self.browser.execute_script(code, element)
|
||||
except:
|
||||
except Exception as e:
|
||||
output = ""
|
||||
self.recordLog("JavaScript execution failed")
|
||||
self.print_and_log("执行下面的代码时出错:" + code, ",错误为:", str(e))
|
||||
self.print_and_log("Error executing the following code:" + code, ", error is:", str(e))
|
||||
elif int(codeMode) == 5:
|
||||
try:
|
||||
code = readCode(code)
|
||||
# global_namespace = globals().copy()
|
||||
# global_namespace["self"] = self
|
||||
output = exec(code)
|
||||
self.recordLog("执行下面的代码:" + code)
|
||||
self.recordLog("Execute the following code:" + code)
|
||||
except Exception as e:
|
||||
self.print_and_log("执行下面的代码时出错:" + code, ",错误为:", e)
|
||||
self.print_and_log("执行下面的代码时出错:" + code, ",错误为:", str(e))
|
||||
self.print_and_log("Error executing the following code:" +
|
||||
code, ", error is:", e)
|
||||
code, ", error is:", str(e))
|
||||
elif int(codeMode) == 6:
|
||||
try:
|
||||
code = readCode(code)
|
||||
@ -847,6 +803,23 @@ class BrowserThread(Thread):
|
||||
self.print_and_log("根据设置的自定义操作,任务已刷新页面|Task refreshed page according to custom operation")
|
||||
elif codeMode == 9: # Send email
|
||||
send_email(node["parameters"]["emailConfig"])
|
||||
elif codeMode == 10: # Clear all field values
|
||||
self.clearOutputParameters()
|
||||
elif codeMode == 11: # Generate a new data row
|
||||
line = new_line(self.outputParameters,
|
||||
self.maxViewLength, self.outputParametersRecord)
|
||||
self.OUTPUT.append(line)
|
||||
elif codeMode == 12: # Exit the program
|
||||
self.print_and_log("根据设置的自定义操作,任务已退出|Task exited according to custom operation")
|
||||
self.saveData(exit=True)
|
||||
self.browser.quit()
|
||||
self.print_and_log("正在清理临时用户目录……|Cleaning up temporary user directory...")
|
||||
try:
|
||||
shutil.rmtree(self.option["tmp_user_data_folder"])
|
||||
except:
|
||||
pass
|
||||
self.print_and_log("清理完成!|Clean up completed!")
|
||||
os._exit(0)
|
||||
else: # 0 1 5 6
|
||||
output = self.execute_code(
|
||||
codeMode, code, max_wait_time, iframe=params["iframe"])
|
||||
@ -1106,7 +1079,25 @@ class BrowserThread(Thread):
|
||||
self.recordLog(
|
||||
"判断条件内所有条件分支的条件都不满足|None of the conditions in the judgment condition are met")
|
||||
|
||||
def handleHistory(self, node, xpath, thisHistoryURL, thisHistoryLength, index, element=None, elements=None):
|
||||
def handleHistory(self, node, xpath, thisHandle, thisHistoryURL, thisHistoryLength, index, element=None, elements=None):
|
||||
try:
|
||||
changed_handle = self.browser.current_window_handle != thisHandle
|
||||
except: # In case the page was closed unexpectedly
|
||||
self.browser.switch_to.window(
|
||||
self.browser.window_handles[-1])
|
||||
changed_handle = self.browser.window_handles[-1] != thisHandle
|
||||
if changed_handle: # The active tab changed after one loop iteration
|
||||
try:
|
||||
while True: # Keep closing tabs until the original tab is current again
|
||||
self.browser.close() # Close the tab that is no longer needed
|
||||
self.browser.switch_to.window(
|
||||
self.browser.window_handles[-1])
|
||||
if self.browser.current_window_handle == thisHandle:
|
||||
break
|
||||
except Exception as e:
|
||||
self.print_and_log("关闭标签页发生错误:", e)
|
||||
self.print_and_log(
|
||||
"Error occurred while closing tab: ", e)
|
||||
if self.history["index"] != thisHistoryLength and self.history["handle"] == self.browser.current_window_handle: # The history changed after one loop iteration; note the check on the current page
|
||||
difference = thisHistoryLength - self.history["index"] # Compute the history offset
|
||||
self.browser.execute_script('history.go(' + str(difference) + ')') # Go back in history
|
||||
@ -1132,12 +1123,13 @@ class BrowserThread(Thread):
|
||||
if self.browser.current_url == thisHistoryURL or ti > thisHistoryLength: # 如果执行完一次循环之后网址发生了变化
|
||||
break
|
||||
time.sleep(2)
|
||||
if element == None: # 不固定元素列表
|
||||
element = self.browser.find_elements(By.XPATH, xpath, iframe=node["parameters"]["iframe"])
|
||||
else: # 固定元素列表
|
||||
element = self.browser.find_element(By.XPATH, xpath, iframe=node["parameters"]["iframe"])
|
||||
# if index > 0:
|
||||
# index -= 1 # 如果是data:开头的网址,就要重试一次
|
||||
if xpath != "":
|
||||
if element is None: # Non-fixed element list
|
||||
element = self.browser.find_elements(By.XPATH, xpath, iframe=node["parameters"]["iframe"])
|
||||
else: # Fixed element list
|
||||
element = self.browser.find_element(By.XPATH, xpath, iframe=node["parameters"]["iframe"])
|
||||
# if index > 0:
|
||||
# index -= 1 # 如果是data:开头的网址,就要重试一次
|
||||
else:
|
||||
if element is None:
|
||||
element = elements
|
||||
@ -1156,6 +1148,14 @@ class BrowserThread(Thread):
|
||||
self.history["handle"] = thisHandle
|
||||
thisHistoryURL = self.browser.current_url
|
||||
# Quick extraction handling
|
||||
# start = time.time()
|
||||
try:
|
||||
tree = html.fromstring(self.browser.page_source)
|
||||
except Exception as e:
|
||||
self.print_and_log("解析页面时出错,将切换普通提取模式|Error parsing page, will switch to normal extraction mode")
|
||||
node["parameters"]["quickExtractable"] = False
|
||||
# end = time.time()
|
||||
# print("解析页面秒数:", end - start)
|
||||
if node["parameters"]["quickExtractable"]:
|
||||
self.browser.switch_to.default_content() # Switch to the main page
|
||||
tree = html.fromstring(self.browser.page_source)
|
||||
@ -1321,25 +1321,7 @@ class BrowserThread(Thread):
|
||||
if self.BREAK:
|
||||
self.BREAK = False
|
||||
break
|
||||
try:
|
||||
changed_handle = self.browser.current_window_handle != thisHandle
|
||||
except: # 如果网页被意外关闭了的情况下
|
||||
self.browser.switch_to.window(
|
||||
self.browser.window_handles[-1])
|
||||
changed_handle = self.browser.window_handles[-1] != thisHandle
|
||||
if changed_handle: # 如果执行完一次循环之后标签页的位置发生了变化
|
||||
try:
|
||||
while True: # 一直关闭窗口直到当前标签页
|
||||
self.browser.close() # 关闭使用完的标签页
|
||||
self.browser.switch_to.window(
|
||||
self.browser.window_handles[-1])
|
||||
if self.browser.current_window_handle == thisHandle:
|
||||
break
|
||||
except Exception as e:
|
||||
self.print_and_log("关闭标签页发生错误:", e)
|
||||
self.print_and_log(
|
||||
"Error occurred while closing tab: ", e)
|
||||
index, elements = self.handleHistory(node, xpath, thisHistoryURL, thisHistoryLength, index, elements=elements)
|
||||
index, elements = self.handleHistory(node, xpath, thisHandle, thisHistoryURL, thisHistoryLength, index, elements=elements)
|
||||
if int(node["parameters"]["breakMode"]) > 0: # A script condition for exiting the loop is set
|
||||
output = self.execute_code(int(
|
||||
node["parameters"]["breakMode"]) - 1, node["parameters"]["breakCode"],
|
||||
@ -1381,25 +1363,7 @@ class BrowserThread(Thread):
|
||||
if self.BREAK:
|
||||
self.BREAK = False
|
||||
break
|
||||
try:
|
||||
changed_handle = self.browser.current_window_handle != thisHandle
|
||||
except: # 如果网页被意外关闭了的情况下
|
||||
self.browser.switch_to.window(
|
||||
self.browser.window_handles[-1])
|
||||
changed_handle = self.browser.window_handles[-1] != thisHandle
|
||||
if changed_handle: # 如果执行完一次循环之后标签页的位置发生了变化
|
||||
try:
|
||||
while True: # 一直关闭窗口直到当前标签页
|
||||
self.browser.close() # 关闭使用完的标签页
|
||||
self.browser.switch_to.window(
|
||||
self.browser.window_handles[-1])
|
||||
if self.browser.current_window_handle == thisHandle:
|
||||
break
|
||||
except Exception as e:
|
||||
self.print_and_log("关闭标签页发生错误:", e)
|
||||
self.print_and_log(
|
||||
"Error occurred while closing tab: ", e)
|
||||
index, element = self.handleHistory(node, path, thisHistoryURL, thisHistoryLength, index, element=element)
|
||||
index, element = self.handleHistory(node, path, thisHandle, thisHistoryURL, thisHistoryLength, index, element=element)
|
||||
except NoSuchElementException:
|
||||
self.print_and_log("Loop element not found: ", path)
|
||||
self.print_and_log("找不到循环元素:", path)
|
||||
@@ -1447,6 +1411,7 @@ class BrowserThread(Thread):
|
||||
code = get_output_code(output)
|
||||
if code <= 0:
|
||||
break
|
||||
index, _ = self.handleHistory(node, "", thisHandle, thisHistoryURL, thisHistoryLength, index)
|
||||
elif int(node["parameters"]["loopType"]) == 4: # Fixed URL list
|
||||
# tempList = node["parameters"]["textList"].split("\r\n")
|
||||
urlList = list(
|
||||
@@ -1696,8 +1661,11 @@ class BrowserThread(Thread):
|
||||
try:
|
||||
actions = ActionChains(self.browser) # Instantiate an ActionChains object
|
||||
if newTab == 1: # Open in a new tab
|
||||
# Ctrl + Click
|
||||
actions.key_down(Keys.CONTROL).click(element).key_up(Keys.CONTROL).perform()
|
||||
if sys.platform == "darwin": # Mac
|
||||
actions.key_down(Keys.COMMAND).click(element).key_up(Keys.COMMAND).perform()
|
||||
else:
|
||||
# Ctrl + Click
|
||||
actions.key_down(Keys.CONTROL).click(element).key_up(Keys.CONTROL).perform()
|
||||
else:
|
||||
actions.click(element).perform()
|
||||
except Exception as e:
|
||||
@@ -1715,6 +1683,21 @@ class BrowserThread(Thread):
|
||||
script = 'var result = document.evaluate(`' + path + \
|
||||
'`, document, null, XPathResult.ANY_TYPE, null);for(let i=0;i<arguments[0];i++){result.iterateNext();} result.iterateNext().click();'
|
||||
self.browser.execute_script(script, str(index)) # Click via JavaScript
|
||||
elif click_way == 2: # Double click
|
||||
try:
|
||||
actions = ActionChains(self.browser) # Instantiate an ActionChains object
|
||||
actions.double_click(element).perform()
|
||||
except Exception as e:
|
||||
self.browser.execute_script("arguments[0].scrollIntoView();", element)
|
||||
try:
|
||||
actions = ActionChains(self.browser) # Instantiate an ActionChains object
|
||||
actions.double_click(element).perform()
|
||||
except Exception as e:
|
||||
self.print_and_log(f"Selenium双击元素{path}失败,将尝试使用JavaScript双击")
|
||||
self.print_and_log(f"Failed to double click element {path} with Selenium, will try to double click with JavaScript")
|
||||
script = 'var result = document.evaluate(`' + path + \
|
||||
'`, document, null, XPathResult.ANY_TYPE, null);for(let i=0;i<arguments[0];i++){result.iterateNext();} result.iterateNext().click();'
|
||||
self.browser.execute_script(script, str(index)) # Click via JavaScript
|
||||
self.recordLog("点击元素|Click element: " + path)
|
||||
except TimeoutException:
|
||||
self.print_and_log(
|
||||
@@ -1797,7 +1780,6 @@ class BrowserThread(Thread):
|
||||
self.print_and_log("History Length Error")
|
||||
self.history["index"] = 0
|
||||
self.scrollDown(param) # Scroll down according to the configured parameters
|
||||
# rt.end()
|
||||
|
||||
def get_content(self, p, element):
|
||||
content = ""
|
||||
@@ -1824,7 +1806,7 @@ class BrowserThread(Thread):
|
||||
downloadPic = 0
|
||||
if downloadPic == 1:
|
||||
download_image(self, content, "Data/Task_" +
|
||||
str(self.id) + "/" + self.saveName + "/", element)
|
||||
str(self.id) + "/" + self.saveName + "/images", element)
|
||||
else: # Ordinary node
|
||||
if p["splitLine"] == 1:
|
||||
text = extract_text_from_html(element.get_attribute('outerHTML'))
|
||||
@@ -1853,7 +1835,7 @@ class BrowserThread(Thread):
|
||||
downloadPic = 0
|
||||
if downloadPic == 1:
|
||||
download_image(self, content, "Data/Task_" +
|
||||
str(self.id) + "/" + self.saveName + "/", element)
|
||||
str(self.id) + "/" + self.saveName + "/images", element)
|
||||
else:
|
||||
command = 'var arr = [];\
|
||||
var content = arguments[0];\
|
||||
@@ -1965,6 +1947,8 @@ class BrowserThread(Thread):
|
||||
content = element.get_attribute(attribute_name)
|
||||
except:
|
||||
content = ""
|
||||
elif p["contentType"] == 15: # Constant value
|
||||
content = p["JS"]
|
||||
if content == None:
|
||||
content = ""
|
||||
return content
|
||||
@@ -2208,7 +2192,9 @@ if __name__ == '__main__':
|
||||
"server_address": "http://localhost:8074",
|
||||
"keyboard": True, # Whether to listen for keyboard input
|
||||
"pause_key": "p", # Pause key
|
||||
"version": "0.6.0",
|
||||
"version": "0.6.3",
|
||||
"docker_driver": "",
|
||||
"user_folder": "",
|
||||
}
|
||||
c = Config(config)
|
||||
print(c)
|
||||
@@ -2283,7 +2269,9 @@ if __name__ == '__main__':
|
||||
|
||||
options.add_argument(
|
||||
"--disable-blink-features=AutomationControlled") # Work around TMALL anti-scraping checks
|
||||
|
||||
# Prevent http -> https redirection
|
||||
options.add_argument("--disable-features=CrossSiteDocumentBlockingIfIsolating,CrossSiteDocumentBlockingAlways,IsolateOrigins,site-per-process")
|
||||
options.add_argument("--disable-web-security") # Disable the same-origin policy
|
||||
options.add_argument('-ignore-certificate-errors')
|
||||
options.add_argument('-ignore -ssl-errors')
|
||||
|
||||
@@ -2302,35 +2290,43 @@ if __name__ == '__main__':
|
||||
os.mkdir(tmp_user_folder_parent)
|
||||
characters = string.ascii_letters + string.digits
|
||||
for i in range(len(c.ids)):
|
||||
id = c.ids[i]
|
||||
# Build a string from characters chosen at random from the character set
|
||||
random_string = ''.join(random.choice(characters) for i in range(10))
|
||||
tmp_user_data_folder = os.path.join(tmp_user_folder_parent, "user_data_" + str(id) + "_" + str(time.time()).replace(".","") + "_" + random_string)
|
||||
tmp_options[i]["tmp_user_data_folder"] = tmp_user_data_folder
|
||||
if os.path.exists(tmp_user_data_folder):
|
||||
try:
|
||||
shutil.rmtree(tmp_user_data_folder)
|
||||
except:
|
||||
pass
|
||||
print(f"Copying user data folder to: {tmp_user_data_folder}, please wait...")
|
||||
print(f"正在复制用户信息目录到: {tmp_user_data_folder},请稍等...")
|
||||
if os.path.exists(absolute_user_data_folder):
|
||||
try:
|
||||
shutil.copytree(absolute_user_data_folder, tmp_user_data_folder)
|
||||
print("User data folder copied successfully, if you exit the program before it finishes, please delete the temporary user data folder manually.")
|
||||
print("用户信息目录复制成功,如果程序在运行过程中被手动退出,请手动删除临时用户信息目录。")
|
||||
except:
|
||||
tmp_user_data_folder = absolute_user_data_folder
|
||||
print("Copy user data folder failed, use the original folder.")
|
||||
print("复制用户信息目录失败,使用原始目录。")
|
||||
else:
|
||||
tmp_user_data_folder = absolute_user_data_folder
|
||||
print("Cannot find user data folder, create a new folder.")
|
||||
print("未找到用户信息目录,创建新目录。")
|
||||
options = tmp_options[i]["options"]
|
||||
options.add_argument(
|
||||
f'--user-data-dir={tmp_user_data_folder}') # Work around TMALL anti-scraping checks
|
||||
options.add_argument("--profile-directory=Default")
|
||||
if c.user_folder == "":
|
||||
id = c.ids[i]
|
||||
# Build a string from characters chosen at random from the character set
|
||||
random_string = ''.join(random.choice(characters) for i in range(10))
|
||||
tmp_user_data_folder = os.path.join(tmp_user_folder_parent, "user_data_" + str(id) + "_" + str(time.time()).replace(".","") + "_" + random_string)
|
||||
tmp_options[i]["tmp_user_data_folder"] = tmp_user_data_folder
|
||||
if os.path.exists(tmp_user_data_folder):
|
||||
try:
|
||||
shutil.rmtree(tmp_user_data_folder)
|
||||
except:
|
||||
pass
|
||||
print(f"Copying user data folder to: {tmp_user_data_folder}, please wait...")
|
||||
print(f"正在复制用户信息目录到: {tmp_user_data_folder},请稍等...")
|
||||
if os.path.exists(absolute_user_data_folder):
|
||||
try:
|
||||
shutil.copytree(absolute_user_data_folder, tmp_user_data_folder)
|
||||
print("User data folder copied successfully, if you exit the program before it finishes, please delete the temporary user data folder manually.")
|
||||
print("用户信息目录复制成功,如果程序在运行过程中被手动退出,请手动删除临时用户信息目录。")
|
||||
except:
|
||||
tmp_user_data_folder = absolute_user_data_folder
|
||||
print("Copy user data folder failed, use the original folder.")
|
||||
print("复制用户信息目录失败,使用原始目录。")
|
||||
else:
|
||||
tmp_user_data_folder = absolute_user_data_folder
|
||||
print("Cannot find user data folder, create a new folder.")
|
||||
print("未找到用户信息目录,创建新目录。")
|
||||
options.add_argument(
|
||||
f'--user-data-dir={tmp_user_data_folder}') # Work around TMALL anti-scraping checks
|
||||
print(f"Use local user data folder: {tmp_user_data_folder}")
|
||||
print(f"使用本地用户信息目录: {tmp_user_data_folder}")
|
||||
else:
|
||||
options.add_argument(
|
||||
f'--user-data-dir={c.user_folder}')
|
||||
print(f"Use specified user data folder: {c.user_folder}, please note if you are using docker, this user folder path should be the path inside the docker container.")
|
||||
print(f"使用指定的用户信息目录: {c.user_folder},请注意如果您正在使用docker,此用户文件夹路径应是容器内的路径。")
|
||||
print(
|
||||
"如果报错Selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally,说明有之前运行的Chrome实例没有正常关闭,请关闭之前打开的所有Chrome实例后再运行程序即可。")
|
||||
print(
|
||||
@@ -2343,9 +2339,13 @@ if __name__ == '__main__':
|
||||
print("id: ", id)
|
||||
if c.read_type == "remote":
|
||||
print("remote")
|
||||
content = requests.get(
|
||||
try:
|
||||
content = requests.get(
|
||||
c.server_address + "/queryExecutionInstance?id=" + str(id))
|
||||
service = json.loads(content.text) # Load the service information
|
||||
service = json.loads(content.text) # Load the service information
|
||||
except:
|
||||
print("Cannot connect to the server, please make sure that the EasySpider Main Program is running, or you can change the --read_type parameter to 'local' to read the task information from the local task file without keeping the EasySpider Main Program running.")
|
||||
print("无法连接到服务器,请确保EasySpider主程序正在运行,或者您可以将--read_type参数更改为'local',以实现从本地任务文件中读取任务信息而无需保持EasySpider主程序运行。")
|
||||
else:
|
||||
print("local")
|
||||
local_folder = os.path.join(os.getcwd(), "execution_instances")
|
||||
@@ -2370,8 +2370,8 @@ if __name__ == '__main__':
|
||||
cloudflare = 0
|
||||
if cloudflare == 0:
|
||||
options.add_argument('log-level=3') # Hide logs
|
||||
path = os.path.join(os.path.abspath("./"), "Data", "Task_" + str(id))
|
||||
print("Data path:", path)
|
||||
path = os.path.join(os.path.abspath("./"), "Data", "Task_" + str(id), "files")
|
||||
print("文件下载路径|File Download path:", path)
|
||||
options.add_experimental_option("prefs", {
|
||||
# Set the file download path
|
||||
"download.default_directory": path,
|
||||
@@ -2396,8 +2396,17 @@ if __name__ == '__main__':
|
||||
except:
|
||||
browser = "chrome"
|
||||
if browser == "chrome":
|
||||
selenium_service = Service(executable_path=driver_path)
|
||||
browser_t = MyChrome(service=selenium_service, options=options)
|
||||
if c.docker_driver == "":
|
||||
print("Using local driver")
|
||||
selenium_service = Service(executable_path=driver_path)
|
||||
browser_t = MyChrome(service=selenium_service, options=options, mode='local_driver')
|
||||
else:
|
||||
print("Using remote driver")
|
||||
# Use docker driver, default address is http://localhost:4444/wd/hub
|
||||
# Headless mode
|
||||
# options.add_argument("--headless")
|
||||
# print("Headless mode")
|
||||
browser_t = MyChrome(command_executor=c.docker_driver, options=options, mode='remote_driver')
|
||||
elif browser == "edge":
|
||||
from selenium.webdriver.edge.service import Service as EdgeService
|
||||
from selenium.webdriver.edge.options import Options as EdgeOptions
|
||||
@@ -2458,6 +2467,7 @@ if __name__ == '__main__':
|
||||
# print("Passing the Cloudflare verification mode is sometimes unstable. If the verification fails, you need to try again every few minutes, or you can change to a new user information folder and then execute the task.")
|
||||
# Use a listener to monitor keyboard input
|
||||
try:
|
||||
from pynput.keyboard import Key, Listener
|
||||
if c.keyboard:
|
||||
with Listener(on_press=on_press_creator(press_time, event),
|
||||
on_release=on_release_creator(event, press_time)) as listener:
|
||||
|
@@ -19,11 +19,16 @@ desired_capabilities["pageLoadStrategy"] = "none"
|
||||
|
||||
|
||||
|
||||
class MyChrome(webdriver.Chrome):
|
||||
class MyChrome(webdriver.Chrome, webdriver.Remote):
|
||||
|
||||
def __init__(self, *args, **kwargs):
|
||||
def __init__(self, mode='local_driver', *args, **kwargs):
|
||||
self.iframe_env = False # Whether the current context is root or an iframe
|
||||
super().__init__(*args, **kwargs) # Call the parent class __init__
|
||||
self.mode = mode
|
||||
if mode == "local_driver":
|
||||
webdriver.Chrome.__init__(self, *args, **kwargs)
|
||||
elif mode == "remote_driver":
|
||||
webdriver.Remote.__init__(self, *args, **kwargs)
|
||||
# super().__init__(*args, **kwargs) # Call the parent class __init__
|
||||
|
||||
# def find_element(self, by=By.ID, value=None, iframe=False):
|
||||
# # Change the element-lookup behavior here
|
||||
|
@@ -59,7 +59,31 @@ def send_email(config):
|
||||
smtp_server.quit()
|
||||
except:
|
||||
pass
|
||||
|
||||
def rename_downloaded_file(download_dir, stop_event):
|
||||
original_files = set(os.listdir(download_dir))
|
||||
|
||||
while not stop_event.is_set():
|
||||
files = os.listdir(download_dir)
|
||||
for file in files:
|
||||
if file in original_files:
|
||||
continue # Skip original files and files that were already renamed
|
||||
|
||||
full_path = os.path.join(download_dir, file)
|
||||
|
||||
if not full_path.endswith('.crdownload') and not full_path.endswith('.htm') and not full_path.endswith('.html') and not file.startswith('esfile_'):
|
||||
new_name = "esfile_" + file.split('/')[-1] + '_' + str(uuid.uuid4()) + '_' + file.split('/')[-1]
|
||||
new_path = os.path.join(download_dir, new_name)
|
||||
try:
|
||||
os.rename(full_path, new_path)
|
||||
original_files.add(new_name) # Record the new file name so it is not renamed again
|
||||
print(f"文件已重命名为|File has been renamed to: {new_path}")
|
||||
except:
|
||||
print("文件重命名失败|File rename failed")
|
||||
|
||||
time.sleep(1) # Check once per second
|
||||
# print("下载文件重命名监控中,请等待...|Download file rename monitoring, please wait...")
|
||||
print("下载文件重命名监控已停止。|Download file rename monitoring has stopped.")
|
||||
|
||||
def is_valid_url(url):
|
||||
try:
|
||||
@@ -505,10 +529,17 @@ def write_to_excel(file_name, data, types, record):
|
||||
for i in range(len(line)):
|
||||
if record[i]:
|
||||
to_write.append(line[i])
|
||||
ws.append(to_write)
|
||||
try:
|
||||
ws.append(to_write)
|
||||
except:
|
||||
print("写入Excel文件失败,请检查数据类型是否正确。")
|
||||
print("Failed to write to Excel file, please check if the data type is correct.")
|
||||
# Save the workbook
|
||||
wb.save(file_name)
|
||||
|
||||
try:
|
||||
wb.save(file_name)
|
||||
except:
|
||||
print("保存Excel文件失败,请检查文件是否被其他程序打开。")
|
||||
print("Failed to save Excel file, please check if the file is opened by other programs.")
|
||||
|
||||
class Time:
|
||||
def __init__(self, type1=""):
|
||||
|
@@ -1,88 +1,17 @@
|
||||
Official Site: https://www.easyspider.net
|
||||
|
||||
Please feel free to recommend this software to your friends.
|
||||
Please feel free to recommend this software to your friends and star our GitHub repository!
|
||||
|
||||
This version is for Windows 10 x64 and above.
|
||||
This version is for Windows 10/Windows Server 2016 x64 and above.
|
||||
|
||||
The Windows version supports **Windows 10 and above**. If you want to use EasySpider on Windows 7, please download the Windows x32 version of EasySpider.
|
||||
If you want to use EasySpider on Windows 7, please download the Windows x32 version of EasySpider. No released version supports Windows Server 2012 or earlier; on those systems you must compile the software yourself.
|
||||
|
||||
The software's open-source code repository on GitHub: https://github.com/NaiboWang/EasySpider
|
||||
|
||||
Official documentation can be found at: https://github.com/NaiboWang/EasySpider/wiki
|
||||
|
||||
Video Tutorial: https://youtube.com/playlist?list=PL0kEFEkWrT7mt9MUlEBV2DTo1QsaanUTp
|
||||
|
||||
This software is not a trojan or virus. If antivirus software such as Windows Defender mistakenly flags it as one, please restore it, or open "EasySpider.bat" to run the software instead.
|
||||
|
||||
Tasks can be imported from other machines by simply placing the .json files from the "tasks" folder of those machines into the "tasks" folder of this directory. Similarly, execution instance files can be imported by copying the .json files from the "execution_instances" folder. Note that only files named with a number greater than 0 are supported in both folders.
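The import rule above can be sketched in Python. This is only an illustration and not part of EasySpider: the helper name `import_task_files` and both directory arguments are hypothetical, and it assumes that "named with a number greater than 0" means the file name without `.json` is a positive integer.

```python
import shutil
from pathlib import Path


def import_task_files(source_dir, target_dir):
    """Copy <number>.json files (number > 0) from source_dir into target_dir.

    Hypothetical helper: it only mirrors the naming rule described above.
    """
    source, target = Path(source_dir), Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.glob("*.json")):
        stem = path.stem
        # Keep only names made of ASCII digits that denote a number above zero.
        if stem.isascii() and stem.isdigit() and int(stem) > 0:
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied
```

Calling `import_task_files("other_machine/tasks", "tasks")` would copy files such as `3.json` while skipping `0.json` or `notes.json`; the same call works for the "execution_instances" folder.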
|
||||
|
||||
|
||||
======Version Update Instructions======
|
||||
|
||||
For new features in versions later than v0.3.2, please see the GitHub releases page: https://github.com/NaiboWang/EasySpider/releases
|
||||
|
||||
-----v0.3.2-----
|
||||
|
||||
## Update Instruction
|
||||
|
||||
1. Selected child element operations can delete fields and unmark deleted fields in real-time in the browser.
|
||||
2. Added a selection mode for child elements that lets you choose either only the child elements present in all blocks or the child elements that match those of the first selected block.
|
||||
3. In the text input and webpage open options, you can use the extracted field value as a variable for text input, represented by Field["field_name"].
|
||||
4. Files can be downloaded, such as PDF files.
|
||||
5. Fixed a bug where the software could display a blank screen for about 10 seconds after opening; it can now be used on intranets, darknets, and any local network.
|
||||
6. Fixed a bug where the current page URL and title could not be extracted.
|
||||
7. Fixed a bug where OCR recognition could fail to extract information.
|
||||
8. Updated extraction logic to save locally every 10 records collected.
|
||||
9. When modifying a task, the default anchor position is set to after the last operation in the task flow.
|
||||
10. Updated Chrome version to 114.
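Item 3 above describes using an extracted field value as a variable written as Field["field_name"]. A minimal sketch of that kind of placeholder substitution follows. It is not EasySpider's actual implementation: the function `fill_fields`, the regular expression, and the example URL are all assumptions made for illustration.

```python
import re

# Matches placeholders of the form Field["field_name"] (assumed syntax).
FIELD_PATTERN = re.compile(r'Field\["([^"\]]+)"\]')


def fill_fields(template, fields):
    """Replace each Field["name"] placeholder with fields[name].

    Placeholders whose name is not in `fields` are left untouched.
    """
    def substitute(match):
        name = match.group(1)
        return str(fields[name]) if name in fields else match.group(0)

    return FIELD_PATTERN.sub(substitute, template)


print(fill_fields('https://example.com/search?q=Field["keyword"]', {"keyword": "spider"}))
```

Leaving unknown placeholders untouched, rather than raising an error, is a design choice of this sketch so that a missing field stays visible in the resulting text.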
|
||||
|
||||
-----v0.3.1-----
|
||||
|
||||
## Update Instruction
|
||||
|
||||
|
||||
1. Advanced Operations:
|
||||
|
||||
- Custom scripts can be executed in the workflow, including executing JavaScript commands in the browser and invoking scripts at the operating system level. The command's return value can be obtained and recorded, greatly expanding the scope of operations.
|
||||
- Before and after each operation, you can specify a JavaScript command to be executed targeting the current located element.
|
||||
|
||||
2. Custom scripts are also supported in the conditions and loop conditions. The return value of the custom script determines the condition for the judgment of conditions and loops, greatly enhancing the flexibility of tasks. The ability to use the break statement within a loop is added, allowing custom operations to manipulate elements within the loop.
|
||||
|
||||
|
||||
3. Multiple XPath expressions are generated simultaneously for user selection, and the XPath Helper extension is pre-installed for XPath debugging.
|
||||
|
||||
4. Added the functionality to extract the background image URL of elements, current page title, and current page URL.
|
||||
|
||||
5. Added the capability to save screenshots of elements or entire web pages. This feature works best in headless mode.
|
||||
|
||||
6. Added the functionality to download images.
|
||||
|
||||
7. Added OCR recognition of elements. To use this feature, the Tesseract library must be installed first: https://tesseract-ocr.github.io/tessdoc/Installation.html
|
||||
|
||||
8. Directly extract the return value of executing JavaScript code on elements, allowing for functionalities such as regular expression matching and obtaining the background color of elements.
|
||||
|
||||
9. Added the capability to switch dropdown options and extract the selected value and text of dropdown options.
|
||||
|
||||
10. Significantly improved user guidance and explanations to make the software more user-friendly, including instructions on handling iframe tags, explanations of what the parameters of each option mean, and guidance on modifying the XPath for loop items.
|
||||
|
||||
11. Added instructions on how to execute tasks from the command line.
|
||||
|
||||
12. Added headless mode configuration, allowing the software to run without a browser interface.
|
||||
|
||||
13. Fixed the issue where Chinese paths couldn't be recognized correctly when using user-configured browser modes.
|
||||
|
||||
14. Fixed the issue where the program would freeze when there was no unconditional branch in the conditional branching.
|
||||
|
||||
15. Fixed the issue where the input box would freeze after saving a task.
|
||||
|
||||
16. Added the option to set the maximum waiting time for page load in the "Open Page" and "Click element" operations.
|
||||
|
||||
17. Added the functionality to move the mouse to an element.
|
||||
|
||||
18. Displays a prompt when an element cannot be found.
|
||||
|
||||
19. Fixed the webpage scrolling bug.
|
||||
|
||||
20. The task name is initialized with the value of the page title upon the first visit.
|
||||
|
||||
21. Added version update prompts.
|
||||
|
||||
22. Added the information of the publisher as requested.
|
||||
|
||||
23. Updated Chrome version to 113.
|
||||
|
@@ -0,0 +1,10 @@
|
||||
打开报错:DiscardVirtualMemory...KERNEL32.dll说明如下:
|
||||
|
||||
64位版本的易采集EasySpider只支持支持Windows 10/Windows Server 2016 x64及以上版本。
|
||||
对于Windows 7任意版本,包括x64和x32版本,以及Windows 10 x32版本请下载Windows的32位版本使用。无任何版本支持Windows Server 2012及以下版本系统,这些系统下需要自行编译运行。
|
||||
|
||||
If you open the software and see an error like: DiscardVirtualMemory...KERNEL32.dll, the reason is:
|
||||
|
||||
The 64-bit version of EasySpider only supports Windows 10/Windows Server 2016 x64 and above.
|
||||
If you use any edition of Windows 7 (x64 or x32) or Windows 10 x32, please download the 32-bit Windows version of EasySpider. No released version supports Windows Server 2012 or earlier; on those systems you must compile the software yourself.
|
||||
|
@@ -23,7 +23,7 @@ For more complex operations, please download the source code and compile it for
|
||||
"""
|
||||
|
||||
# 请在下面编写你的代码,不要有代码缩进!!! | Please write your code below, do not indent the code!!!
|
||||
|
||||
print(globals())
|
||||
# 导包 | Import packages
|
||||
from selenium.common.exceptions import ElementClickInterceptedException
|
||||
|
||||
@@ -56,3 +56,20 @@ finally:
|
||||
print("All parameters:", self.outputParameters)
|
||||
print(test(3))
|
||||
print("执行完毕|Execution completed")
|
||||
|
||||
import time
|
||||
time.sleep(3)
|
||||
|
||||
def new_line(outputParameters, maxViewLength, record):
|
||||
line = []
|
||||
print("Use this function to print a new line in the console")
|
||||
i = 0
|
||||
for value in outputParameters.values():
|
||||
line.append(value)
|
||||
if record[i]:
|
||||
print(value[:maxViewLength], " ", end="")
|
||||
i += 1
|
||||
print("")
|
||||
return line
|
||||
|
||||
new_line(self.outputParameters, 10, [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True])
|
File diff suppressed because one or more lines are too long
@@ -1 +1 @@
|
||||
{"id":228,"name":"[2312.02977] Exploring the nonclassical dynamics of the \"classical'' Schrödinger equation","url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","create_time":"12/7/2023, 2:44:58 AM","update_time":"12/7/2023, 2:56:47 AM","version":"0.6.0","saveThreshold":10,"quitWaitTime":60,"environment":1,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"desc":"https://arxiv.org/abs/2312.02977","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://arxiv.org/abs/2312.02977","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://arxiv.org/abs/2312.02977"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, 
\"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}}]}
|
||||
{"id":228,"name":"[2312.02977] Exploring the nonclassical dynamics of the \"classical'' Schrödinger equation","url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","create_time":"12/7/2023, 2:44:58 AM","update_time":"2024-01-05 22:08:46","version":"0.6.0","saveThreshold":10,"quitWaitTime":3,"environment":1,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"TTT","dataWriteMode":3,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://arxiv.org/abs/2312.02977","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://arxiv.org/abs/2312.02977","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://arxiv.org/abs/2312.02977"},{"id":1,"name":"loopTimes_1","nodeId":5,"nodeName":"循环 - 单个元素","desc":"循环循环 - 单个元素执行的次数(0代表无限循环)","type":"int","exampleValue":10,"value":10}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,5],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":3,"index":2,"parentId":2,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, 
\"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":-1,"index":3,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":2,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, \"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":-1,"index":4,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":3,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, \"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button 
download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":2,"index":5,"parentId":0,"type":1,"option":8,"title":"循环 - 单个元素","sequence":[2],"isInLoop":false,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"//body","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":0,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":10,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0}}]}
|
@@ -1 +1 @@
|
||||
{"id":229,"name":"知乎 - 有问题,就会有答案","url":"https://www.zhihu.com","links":"https://www.zhihu.com","create_time":"07/12/2023, 03:26:24","update_time":"07/12/2023, 03:43:34","version":"0.6.0","saveThreshold":10,"quitWaitTime":6,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"pauseKey":"t","containJudge":false,"desc":"https://www.zhihu.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.zhihu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.zhihu.com"}],"outputParameters":[{"id":0,"name":"参数1_文本","desc":"","type":"text","recordASField":1,"exampleValue":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.zhihu.com","links":"https://www.zhihu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","wai
tElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":["/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","//div[contains(., '死刑执行前可以谎称肚')]","/html/body/div[last()-7]/div/main/div/div/div[last()-1]/div/div/div/div/div/div[last()-12]/div/div/div/div/h2/div"]}},{"id":3,"index":3,"parentId":2,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":0,"contentType":0,"relative":true,"name":"参数1_文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"unique_index":"onlvi030w9jlpu5tjzb","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0}],"loopType":1}}]}
|
||||
{"id":229,"name":"知乎 - 有问题,就会有答案","url":"https://www.zhihu.com","links":"https://www.zhihu.com","create_time":"07/12/2023, 03:26:24","update_time":"2023-12-27 20:05:50","version":"0.6.0","saveThreshold":10,"quitWaitTime":6,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"t","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"知了个乎","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.zhihu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.zhihu.com"},{"id":1,"name":"loopTimes_1","nodeId":4,"nodeName":"循环 - 单个元素","desc":"循环循环 - 单个元素执行的次数(0代表无限循环)","type":"int","exampleValue":0,"value":0}],"outputParameters":[{"id":0,"name":"参数1_文本","desc":"","type":"text","recordASField":1,"exampleValue":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,4,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.zhihu.com","links":"https://www.zhihu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":3,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":2,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]
/div[1]/div/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":["/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","//div[contains(., '死刑执行前可以谎称肚')]","/html/body/div[last()-7]/div/main/div/div/div[last()-1]/div/div/div/div/div/div[last()-12]/div/div/div/div/h2/div"]}},{"id":4,"index":3,"parentId":3,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":0,"contentType":0,"relative":true,"name":"参数1_文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"unique_index":"onlvi030w9jlpu5tjzb","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0}],"loopType":1}},{"id":2,"index":4,"parentId":0,"type":1,"option":8,"title":"循环 - 
单个元素","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":0,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0}}]}
|
@@ -1 +1 @@
|
||||
{"id":70,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"5/24/2023, 8:21:45 PM","version":"0.3.1","containJudge":false,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"string","exampleValue":"https://www.jd.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","wait":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":""}},{"id":3,"index":3,"parentId":2,"type":0,"option":7,"title":"移动到元素","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":2,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"allXPaths":"","loopType":1}}]}
|
||||
{"id":-2,"name":"京东全球版-专业的综合网上购物商城","url":"https://www.jd.com","links":"https://www.jd.com","create_time":"5/24/2023, 8:21:45 PM","version":"0.3.1","containJudge":false,"desc":"https://www.jd.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.jd.com","desc":"要采集的网址列表,多行以\\n分开","type":"string","exampleValue":"https://www.jd.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","wait":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"url":"https://www.jd.com","links":"https://www.jd.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":""}},{"id":3,"index":3,"parentId":2,"type":0,"option":7,"title":"移动到元素","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":true,"xpath":"/html/body/div[5]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div[1]/div","wait":2,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"allXPaths":"","loopType":1}}]}
|
File diff suppressed because one or more lines are too long
@@ -1,105 +1,17 @@
|
||||
Feel free to share this software with anyone who might need it!
|
||||
Feel free to share this software with anyone who might need it, and to star our GitHub repository!
|
||||
|
||||
Official website: https://www.easyspider.cn
|
||||
|
||||
Supports Windows 10 x64 and later.
|
||||
Supports Windows 10 / Windows Server 2016 x64 and later.
|
||||
|
||||
For any edition of Windows 7 (both x64 and x32) and for Windows 10 x32, please download the 32-bit Windows build.
|
||||
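For any edition of Windows 7 (both x64 and x32) and for Windows 10 x32, please download the 32-bit Windows build. No prebuilt version supports Windows Server 2012 or earlier; on those systems you need to compile and run the software yourself.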
|
||||
|
||||
Open-source code repository on GitHub: https://github.com/NaiboWang/EasySpider
|
||||
|
||||
Official documentation: https://github.com/NaiboWang/EasySpider/wiki
|
||||
|
||||
Video tutorial: https://www.bilibili.com/video/BV1th411A7ey/
|
||||
|
||||
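This software is definitely not a Trojan or a virus! If antivirus software such as Windows Defender mistakenly flags it as a virus, please restore it, or open "EasySpider.bat" to run the software.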
|
||||
|
||||
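Tasks can be imported from other machines: simply place the .json files from the other machine's tasks folder into the tasks folder in this directory. Likewise, execution instance files can be imported by copying the .json files in the execution_instances folder. Note that .json files in both folders must be named with a number greater than 0.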
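The import rule described here (copy `.json` files whose names are numbers greater than 0) can be sketched in Python. This is only a minimal illustration and not part of EasySpider: the function name `import_task_files` and its arguments are hypothetical, and the sketch overwrites any file of the same name already present in the destination folder.

```python
import shutil
from pathlib import Path


def import_task_files(src_dir, dst_dir):
    """Copy N.json files (N a positive integer) from src_dir into dst_dir.

    Hypothetical helper for illustration; returns the copied file names.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(src_dir).glob("*.json")):
        stem = path.stem
        # Accept only plain ASCII digits that form a number greater than 0.
        if stem.isascii() and stem.isdigit() and int(stem) > 0:
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied
```

The same function could be pointed at an `execution_instances` folder, since the naming rule stated above is the same for both folders.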
|
||||
|
||||
|
||||
======Release Notes======
|
||||
|
||||
For release notes of versions after v0.3.2, see the GitHub Releases page: https://github.com/NaiboWang/EasySpider/releases
|
||||
|
||||
-----v0.3.2-----
|
||||
|
||||
## Update Notes
|
||||
|
||||
1. The select-child-elements operation can now delete fields, and deleted fields are unmarked in the browser in real time.
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/e016c832-6ff9-4814-b86c-38787e73aa30" width=50% />
|
||||
|
||||
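2. Selecting child elements now has selection modes: select only the child elements present in every block, or the child elements in every block that match those of the first selected block.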
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/0082b11d-96bc-43f1-acdb-8280decb48b4" width=50% />
|
||||
|
||||
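3. In the input-text and open-page options, the most recently extracted field value can be used **as a variable** for text input, written as `Field["字段名"]` (where `字段名` is the field name).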
|
||||

|
||||
|
||||
4. Files such as PDFs can be downloaded.
|
||||
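5. Fixed a bug where the window could stay blank for about 10 seconds after opening, so the software can now be used on intranets, the dark web, and any local network.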
|
||||
6. Fixed a bug where the current page URL and title could fail to be extracted.
|
||||
7. Fixed a bug where OCR recognition could fail to extract content.
|
||||
8. Extraction now saves to local storage once every 10 collected records.
|
||||
9. When modifying a task, the default anchor position is now after the last operation in the task flow.
|
||||
10. Updated Chrome to version 114.
|
||||
|
||||
|
||||
-----v0.3.1-----
|
||||
|
||||
### Watching the new-feature walkthrough videos is strongly recommended
|
||||
|
||||
Videos covering the latest features have been uploaded to Bilibili; they are very useful and worth watching.
|
||||
|
||||
[Important: custom condition checks using the return value of JS commands inside loop items - Part 2](https://www.bilibili.com/video/BV1mu411x7Nn/)
|
||||
|
||||
[How to run multiple tasks at the same time (parallel instances)](https://www.bilibili.com/video/BV13c411G7LE/)
|
||||
|
||||
[How to run your own JS code and system-level code (custom operations)](https://www.bilibili.com/video/BV1qs4y1z7Hc/)
|
||||
|
||||
[How to customize loop and condition checks - Part 1](https://www.bilibili.com/video/BV1Ys4y1z777/)
|
||||
|
||||
[How to screenshot elements and pages, and a guide to command-line execution (headless mode)](https://www.bilibili.com/video/BV1dV4y1z764/)
|
||||
|
||||
[OCR recognition of element content](https://www.bilibili.com/video/BV1xz4y1b72D/)
|
||||
|
||||
Note: the `.json` files in the task folder of v0.3.1 are incompatible with all earlier versions; please redesign your tasks for v0.3.1.
|
||||
|
||||
## Update Notes
|
||||
1. Custom operations:
|
||||
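- Tasks can now **run custom scripts** in the task flow, including **executing JavaScript in the browser** and **invoking operating-system-level scripts**, with the **command's return value captured and recorded**, which greatly extends what tasks can do.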
|
||||
|
||||

|
||||
|
||||
- Before and after every operation, you can specify a piece of JavaScript to run against the currently located element.
|
||||
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/dde64388-5668-40ff-951e-fb8f60655c49" height=50% width=50%>
|
||||
|
||||
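2. **Condition checks and loop conditions** can likewise **run custom scripts**, using whether the script's return value is truthy as the condition for branching and looping, which again greatly increases what tasks can do. Loops gain a setting to break via code, and custom operations can act on elements inside the loop.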
|
||||

|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/5ce7cf50-e5c9-4714-a83b-9c65934e9c68" width=50%></img>
|
||||
|
||||
3. Multiple XPaths can be generated at once for the user to choose from, and the **XPath Helper extension is preinstalled** for debugging XPaths.
|
||||
4. Added collection of an element's background image URL, the current page title, and the current page URL.
|
||||
5. Added saving element screenshots; use this to screenshot an element or the whole page (works best with headless mode).
|
||||
6. Added image download.
|
||||
7. Added OCR recognition of elements (to use it, you must first install the Tesseract library yourself: [https://blog.csdn.net/u010454030/article/details/80515501](https://blog.csdn.net/u010454030/article/details/80515501))
|
||||
|
||||
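8. The return value of JavaScript executed on an element can be extracted directly, enabling things like regular-expression matching or reading an element's background color.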
|
||||
9. Added switching dropdown options and collecting the currently selected value and text of a dropdown.
|
||||
<img src="https://github.com/NaiboWang/EasySpider/assets/30287768/c0b2bec1-2a97-4516-930e-1b310697212b" width=50%></img>
|
||||
|
||||

|
||||
|
||||
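10. Greatly expanded the usage hints and explanations to make the software easier to use (for example, how to handle iframe tags, what each option's parameters mean, and how to modify a loop item's XPath).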
|
||||
11. When executing a task, a hint now shows how to run tasks from the command line: [https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction](https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction).
|
||||

|
||||
12. Added parallel multi-instance mode.
|
||||
13. Added headless mode, i.e. a configuration that runs without a browser window.
|
||||
14. Fixed Chinese paths not being recognized correctly in user-profile browser mode.
|
||||
15. Fixed a hang when a conditional branch had no unconditional (default) branch.
|
||||
16. Fixed input boxes freezing after saving a task.
|
||||
17. The open-page and click-element operations have a new setting for the maximum page-load wait time.
|
||||
18. Added moving the mouse to an element.
|
||||
19. A prompt is shown when an element cannot be found.
|
||||
20. Fixed a page-scrolling bug.
|
||||
21. Added an operation for adding new extracted-data fields.
|
||||
22. The task name is initialized to the title of the first page visited.
|
||||
23. Added update notifications for new versions.
|
||||
24. Added producer information on request.
|
||||
25. Updated Chrome to version 113.
|
||||
|
||||
|
@@ -1 +1 @@
|
||||
Note: The various folders within this directory are not directly usable software, but temporary folders used by the author at the time of release. Please visit the official website to download readily usable software packages: https://www.easyspider.cn
|
||||
Note: The various folders within this directory are not directly usable software, but temporary folders used by the author at the time of release. Please visit the official website to download readily usable software packages: https://www.easyspider.net
|
@@ -64,49 +64,49 @@ def compress_folder_to_7z_split(folder_path, output_file):
|
||||
except:
|
||||
subprocess.call(["7zz", "a", "-v95m", output_file, folder_path])
|
||||
|
||||
easyspider_version = "0.6.0"
|
||||
easyspider_version = "0.6.3"
|
||||
|
||||
if __name__ == "__main__":
|
||||
|
||||
if sys.platform == "win32" and platform.architecture()[0] == "64bit":
|
||||
file_name = f"EasySpider_{easyspider_version}_windows_x64.7z"
|
||||
if os.path.exists("./EasySpider_windows_x64/user_data"):
|
||||
shutil.rmtree("./EasySpider_windows_x64/user_data")
|
||||
if os.path.exists("./EasySpider_windows_x64/Data"):
|
||||
shutil.rmtree("./EasySpider_windows_x64/Data")
|
||||
if os.path.exists("./EasySpider_windows_x64/execution_instances"):
|
||||
shutil.rmtree("./EasySpider_windows_x64/execution_instances")
|
||||
if os.path.exists("./EasySpider_windows_x64/config.json"):
|
||||
os.remove("./EasySpider_windows_x64/config.json")
|
||||
if os.path.exists("./EasySpider_windows_x64/mysql_config.json"):
|
||||
os.remove("./EasySpider_windows_x64/mysql_config.json")
|
||||
if os.path.exists("./EasySpider_windows_x64/TempUserDataFolder"):
|
||||
shutil.rmtree("./EasySpider_windows_x64/TempUserDataFolder")
|
||||
os.mkdir("./EasySpider_windows_x64/Data")
|
||||
os.mkdir("./EasySpider_windows_x64/execution_instances")
|
||||
# compress_folder_to_7z_split("./EasySpider_windows_x64", file_name)
|
||||
file_name = f"EasySpider_{easyspider_version}_Windows_x64.7z"
|
||||
if os.path.exists("./EasySpider_Windows_x64/user_data"):
|
||||
shutil.rmtree("./EasySpider_Windows_x64/user_data")
|
||||
if os.path.exists("./EasySpider_Windows_x64/Data"):
|
||||
shutil.rmtree("./EasySpider_Windows_x64/Data")
|
||||
if os.path.exists("./EasySpider_Windows_x64/execution_instances"):
|
||||
shutil.rmtree("./EasySpider_Windows_x64/execution_instances")
|
||||
if os.path.exists("./EasySpider_Windows_x64/config.json"):
|
||||
os.remove("./EasySpider_Windows_x64/config.json")
|
||||
if os.path.exists("./EasySpider_Windows_x64/mysql_config.json"):
|
||||
os.remove("./EasySpider_Windows_x64/mysql_config.json")
|
||||
if os.path.exists("./EasySpider_Windows_x64/TempUserDataFolder"):
|
||||
shutil.rmtree("./EasySpider_Windows_x64/TempUserDataFolder")
|
||||
os.mkdir("./EasySpider_Windows_x64/Data")
|
||||
os.mkdir("./EasySpider_Windows_x64/execution_instances")
|
||||
# compress_folder_to_7z_split("./EasySpider_Windows_x64", file_name)
|
||||
# print(f"Compress {file_name} Split successfully!")
|
||||
compress_folder_to_7z("./EasySpider_windows_x64", file_name)
|
||||
compress_folder_to_7z("./EasySpider_Windows_x64", file_name)
|
||||
print(f"Compress {file_name} successfully!")
|
||||
elif sys.platform == "win32" and platform.architecture()[0] == "32bit":
|
||||
file_name = f"EasySpider_{easyspider_version}_windows_x32.7z"
|
||||
if os.path.exists("./EasySpider_windows_x32/user_data"):
|
||||
shutil.rmtree("./EasySpider_windows_x32/user_data")
|
||||
if os.path.exists("./EasySpider_windows_x32/Data"):
|
||||
shutil.rmtree("./EasySpider_windows_x32/Data")
|
||||
if os.path.exists("./EasySpider_windows_x32/execution_instances"):
|
||||
shutil.rmtree("./EasySpider_windows_x32/execution_instances")
|
||||
if os.path.exists("./EasySpider_windows_x32/config.json"):
|
||||
os.remove("./EasySpider_windows_x32/config.json")
|
||||
if os.path.exists("./EasySpider_windows_x32/mysql_config.json"):
|
||||
os.remove("./EasySpider_windows_x32/mysql_config.json")
|
||||
if os.path.exists("./EasySpider_windows_x32/TempUserDataFolder"):
|
||||
shutil.rmtree("./EasySpider_windows_x32/TempUserDataFolder")
|
||||
os.mkdir("./EasySpider_windows_x32/Data")
|
||||
os.mkdir("./EasySpider_windows_x32/execution_instances")
|
||||
# compress_folder_to_7z_split("./EasySpider_windows_x32", file_name)
|
||||
file_name = f"EasySpider_{easyspider_version}_Windows_x32.7z"
|
||||
if os.path.exists("./EasySpider_Windows_x32/user_data"):
|
||||
shutil.rmtree("./EasySpider_Windows_x32/user_data")
|
||||
if os.path.exists("./EasySpider_Windows_x32/Data"):
|
||||
shutil.rmtree("./EasySpider_Windows_x32/Data")
|
||||
if os.path.exists("./EasySpider_Windows_x32/execution_instances"):
|
||||
shutil.rmtree("./EasySpider_Windows_x32/execution_instances")
|
||||
if os.path.exists("./EasySpider_Windows_x32/config.json"):
|
||||
os.remove("./EasySpider_Windows_x32/config.json")
|
||||
if os.path.exists("./EasySpider_Windows_x32/mysql_config.json"):
|
||||
os.remove("./EasySpider_Windows_x32/mysql_config.json")
|
||||
if os.path.exists("./EasySpider_Windows_x32/TempUserDataFolder"):
|
||||
shutil.rmtree("./EasySpider_Windows_x32/TempUserDataFolder")
|
||||
os.mkdir("./EasySpider_Windows_x32/Data")
|
||||
os.mkdir("./EasySpider_Windows_x32/execution_instances")
|
||||
# compress_folder_to_7z_split("./EasySpider_Windows_x32", file_name)
|
||||
# print(f"Compress {file_name} Split successfully!")
|
||||
compress_folder_to_7z("./EasySpider_windows_x32", file_name)
|
||||
compress_folder_to_7z("./EasySpider_Windows_x32", file_name)
|
||||
print(f"Compress {file_name} successfully!")
|
||||
elif sys.platform == "linux" and platform.architecture()[0] == "64bit":
|
||||
file_name = f"EasySpider_{easyspider_version}_Linux_x64.tar.xz"
|
||||
|
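The Windows x64 and x32 branches in the hunk above repeat the same cleanup: remove per-user folders and config files, then recreate two empty folders. As a minimal sketch of that pattern (not the script's actual code), the steps could be factored into one helper. The name `reset_release_folder` and the three tuples are hypothetical; the folder and file names are the ones that appear in the diff.

```python
import os
import shutil

# Hypothetical constants for illustration; the names come from the diff above.
DIRS_TO_REMOVE = ("user_data", "Data", "execution_instances", "TempUserDataFolder")
FILES_TO_REMOVE = ("config.json", "mysql_config.json")
DIRS_TO_RECREATE = ("Data", "execution_instances")


def reset_release_folder(folder):
    """Remove per-user state from a release folder, then recreate empty data folders."""
    for name in DIRS_TO_REMOVE:
        path = os.path.join(folder, name)
        if os.path.exists(path):
            shutil.rmtree(path)
    for name in FILES_TO_REMOVE:
        path = os.path.join(folder, name)
        if os.path.exists(path):
            os.remove(path)
    for name in DIRS_TO_RECREATE:
        os.mkdir(os.path.join(folder, name))
```

With such a helper, each platform branch would reduce to one call such as `reset_release_folder("./EasySpider_Windows_x64")` followed by the compression step, and a rename like `windows` to `Windows` would touch one string per branch instead of fifteen.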
BIN
ElectronJS/EasySpider_en.crx
Normal file
Binary file not shown.
BIN
ElectronJS/EasySpider_zh.crx
Normal file
Binary file not shown.
@@ -1,4 +1,8 @@
|
||||
# 环境编译说明|Environment Compilation Instruction
|
||||
## 视频教程
|
||||
|
||||
[从源代码编译程序并设计运行和调试任务指南(基于Ubuntu24.04)](https://www.bilibili.com/video/BV1VE421P7yj/)
|
||||
|
||||
# 环境编译说明 | Environment Compilation Instruction
|
||||
|
||||
EasySpider分三部分:
|
||||
|
||||
@@ -19,35 +23,35 @@ EasySpider is divided into three parts:
|
||||
This section covers the compilation instructions for the `main program`.
|
||||
|
||||
|
||||
## 建议编译顺序|Suggested Compilation Order
|
||||
## 建议编译顺序 | Suggested Compilation Order
|
||||
|
||||
1. 编译浏览器扩展,否则在主程序执行时会提示找不到`EasySpider_zh.crx`的错误。
|
||||
2. 编译主程序,此时主程序可以正常运行,但无法执行任务,只能设计任务。
|
||||
3. 编译执行阶段程序,否则无法执行程序,只能设计程序。
|
||||
3. 编译执行阶段程序,否则无法执行任务,只能设计任务。
|
||||
|
||||
-----
|
||||
|
||||
1. Compile the browser extension first; otherwise the main program reports at startup that `EasySpider_en.crx` cannot be found.
|
||||
2. Compile the main program. At this point the main program runs normally, but it can only design tasks, not execute them.
|
||||
3. Compile the execution stage program, otherwise the program cannot be executed, can only design the program.
|
||||
3. Compile the execution stage program; otherwise tasks can only be designed, not executed.
|
||||
|
||||
## 注意事项|Note
|
||||
## 注意事项 | Note
|
||||
|
||||
请记住,每当EasySpider扩展程序和执行程序更新时,都要更新`EasySpider.crx`和`easyspider_executestage`文件。
|
||||
|
||||
Remember to update the `EasySpider.crx` and `easyspider_executestage` files whenever the EasySpider extension and execution program are updated.
|
||||
|
||||
## 环境构建|Environment Setup
|
||||
## 环境构建 | Environment Setup
|
||||
|
||||
以下以Windows x64版本为例。
|
||||
|
||||
The following uses the Windows x64 version as an example.
|
||||
|
||||
### 浏览器和驱动|Browser and Driver
|
||||
### 浏览器和驱动 | Browser and Driver
|
||||
|
||||
实在搞不定本节的情况下,下载一个直接能用的EasySpider,并把文件夹内的`EasySpider\resources\app\chrome_win64`文件夹拷贝到此`ElectronJS`文件夹下即可。
|
||||
实在搞不定本节的情况下,下载一个直接能用的EasySpider,并把文件夹内的`EasySpider\resources\app\chrome_win64`文件夹拷贝到此`ElectronJS`文件夹下,并把`chrome_win64`文件夹下的`execute.sh`在原文件夹下复制一份并命名为`execute_win64.sh`即可。
|
||||
|
||||
If you're unable to handle the tasks in this section, you can download a ready-to-use EasySpider. Simply copy the `EasySpider\resources\app\chrome_win64` folder from the downloaded files and paste it into the ElectronJS folder.
|
||||
If you're unable to handle the tasks in this section, you can download a ready-to-use EasySpider, and copy the `EasySpider\resources\app\chrome_win64` folder to this `ElectronJS` folder, then copy the `execute.sh` script found in the `chrome_win64` folder and rename it as `execute_win64.sh` in the same location.
|
||||
|
||||
------
|
||||
|
||||
@@ -66,7 +70,7 @@ chrome_linux64/ # for linux x64
|
||||
chrome_mac64/ # for mac x64
|
||||
```
|
||||
|
||||
然后,从下面的页面下载和**自己安装的Chrome版本一致**的Chromedriver:[https://chromedriver.chromium.org/downloads](https://chromedriver.chromium.org/downloads),把chromedriver放入刚刚的`chrome`文件夹内,并更名为下面的格式:
|
||||
然后,从下面的页面下载和**自己安装的Chrome版本一致**的Chromedriver:[https://googlechromelabs.github.io/chrome-for-testing/](https://googlechromelabs.github.io/chrome-for-testing/),把chromedriver放入刚刚的`chrome`文件夹内,并更名为下面的格式:
|
||||
|
||||
```
|
||||
chromedriver_win32.exe # for windows x32
|
||||
@@ -77,7 +81,7 @@ chromedriver_mac64 # for mac x64
|
||||
|
||||
例如,如果您想在Windows x64平台上构建此软件,那么您首先需要下载适用于Windows x64的Chrome浏览器,并将整个`chrome`文件夹复制到`ElectronJS`文件夹中,然后将文件夹重命名为`chrome_win64`。假设您下载的Chrome版本是110。接下来,下载一个适用于Windows x64的110版本的ChromeDriver,并将其放入`chrome_win64`文件夹中,然后将其重命名为`chromedriver_win64.exe`。
|
||||
|
||||
最后,把此文件夹内的`stealth.min.js`和`execute.bat`文件拷贝入`chrome`文件夹内。
|
||||
最后,把此`ElectronJS`文件夹内的`stealth.min.js`和`execute_win64.bat`文件拷贝入`chrome_win64`文件夹内,**这一步不要忘**。
|
||||
|
||||
|
||||
Download Chrome from https://www.google.com/chrome/ and put it into this folder, using the following naming format:
|
||||
@@ -100,33 +104,31 @@ chromedriver_mac64 # for mac x64
|
||||
|
||||
For example, if you want to build this software on Windows x64 platform, then you should first download a Chrome for Windows x64, then copy the whole `chrome` folder to this `ElectronJS` folder and rename the folder to `chrome_win64`, assume the Chrome version you downloaded is 110; then, download a `chromedriver.exe` with version 110 for Windows x64, and put it into the `chrome_win64` folder, then rename it to `chromedriver_win64.exe`.
|
||||
|
||||
Finally, copy the `stealth.min.js` and `execute.bat` (for Windows x64) file in this folder to these `chrome` folders.
|
||||
Finally, copy the `stealth.min.js` and `execute_win64.bat` file in this `ElectronJS` folder to the `chrome_win64` folder **(do not forget this step)**.
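Because this copy step is easy to miss, a small preflight check can confirm that the `chrome_win64` folder contains what the instructions above describe. The following is only a sketch under stated assumptions: `missing_browser_files` is a hypothetical helper and not part of the repository, and the file list is taken from this README and from the paths that `main.js` references for Windows x64.

```python
from pathlib import Path

# Files the Windows x64 setup is expected to contain, per the steps above.
REQUIRED_WIN64_FILES = (
    "chrome.exe",
    "chromedriver_win64.exe",
    "stealth.min.js",
    "execute_win64.bat",
)


def missing_browser_files(electron_dir, folder="chrome_win64", required=REQUIRED_WIN64_FILES):
    """Return the required file names that are absent from electron_dir/folder."""
    base = Path(electron_dir) / folder
    return [name for name in required if not (base / name).is_file()]
```

Running `missing_browser_files(".")` from the `ElectronJS` folder and getting an empty list would suggest the browser, driver, and helper scripts are in place; other platforms would need their own folder and file names.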
|
||||
|
||||
### NodeJS环境|NodeJS Environment
|
||||
### NodeJS环境 | NodeJS Environment
|
||||
|
||||
1. Windows环境下需要先安装`VS Build Tools 2017` ([https://aka.ms/vs/15/release/vs_buildtools.exe](https://aka.ms/vs/15/release/vs_buildtools.exe))的`Visual C++ Build Tools`组件,不然下面的命令无法执行,其他系统不需要。
|
||||
1. Windows环境下需要先下载`VS Build Tools 2017` ([https://aka.ms/vs/15/release/vs_buildtools.exe](https://aka.ms/vs/15/release/vs_buildtools.exe))并勾选安装其中的`Visual C++ Build Tools(Visual C++生成工具)`组件以便`node-gyp`模块来安装`node-windows-manager`,不然下面的命令无法执行,其他系统不需要。同时,`Python3`也需要安装在系统中并配置好环境变量。
|
||||
2. 安装`NodeJS`:[https://nodejs.org/zh-cn/download/](https://nodejs.org/zh-cn/download/)。
|
||||
3. 运行下面的命令来安装依赖:
|
||||
|
||||
```
|
||||
npm install
|
||||
npm install @electron-forge/cli -g
|
||||
```
|
||||
|
||||
如果上面的命令运行速度很慢可以参考NodeJS换源说明:[https://blog.csdn.net/qq_23211463/article/details/123769061](https://blog.csdn.net/qq_23211463/article/details/123769061)。
|
||||
如果上面的命令运行速度很慢可以参考使用NodeJS和Electron包的换源说明来加速安装:[https://blog.csdn.net/qq_38463737/article/details/140277803](https://blog.csdn.net/qq_38463737/article/details/140277803)。
|
||||
|
||||
-----
|
||||
|
||||
1. On Windows, you need to install `VS Build Tools 2017` (https://aka.ms/vs/15/release/vs_buildtools.exe, select and install the `Visual C++ Build Tools` component) first for node-gyp to install `node-windows-manager` (No need for other OS).
|
||||
1. On Windows, you need to download `VS Build Tools 2017` (https://aka.ms/vs/15/release/vs_buildtools.exe, select and install the `Visual C++ Build Tools` component) first for the module `node-gyp` to install `node-windows-manager` (No need for other OS). Meanwhile, `Python3` needs to be installed and the environment variables need to be configured.
|
||||
2. Install `NodeJS`: [https://nodejs.org/en/download/](https://nodejs.org/en/download/).
|
||||
3. Run the following commands to install NodeJS packages:
|
||||
|
||||
```
|
||||
npm install
|
||||
npm install @electron-forge/cli -g
|
||||
```
|
||||
|
||||
## 运行说明|Run Instruction
|
||||
## 运行说明 | Run Instruction
|
||||
|
||||
在当前文件夹执行以下命令即可在开发模式下运行程序:
|
||||
|
||||
@@ -146,25 +148,23 @@ npm run start_direct
|
||||
|
||||
At this point you can only design tasks, not execute them. To execute tasks, you also need to follow the compilation instructions for the task execution program in the `ExecuteStage` folder.
|
||||
|
||||
## 打包发布说明|Package Instruction
|
||||
## 打包发布说明 | Package Instruction
|
||||
|
||||
打包发布前,确保执行阶段程序`easyspider_executestage(.exe)`已放入`chrome(_win64)`文件夹内,且浏览器插件`EasySpider_zh.crx`已经是最新版本。
|
||||
|
||||
执行下面的命令即可打包:
|
||||
执行下面的命令即可打包(需要安装`Git`):
|
||||
|
||||
```
|
||||
npx electron-forge import
|
||||
npm run package
|
||||
```
|
||||
|
||||
-----
|
||||
|
||||
Before packaging and releasing, make sure that the task execution program `easyspider_executestage(.exe)` is placed inside the `chrome(_win64)` folder and that the browser extension `EasySpider_en.crx` is the latest version.
|
||||
Before packaging and releasing, make sure that the task execution program `easyspider_executestage(.exe)` is placed inside the `chrome(_win64)` folder and that the browser extension `EasySpider_en.crx` is the latest version.
|
||||
|
||||
After finishing developing, package software by the following command:
|
||||
After development is finished, package the software with the following command (`Git` is required):
|
||||
|
||||
```
|
||||
npx electron-forge import
|
||||
npm run package
|
||||
```
|
||||
|
||||
@@ -186,8 +186,43 @@ package_win64.cmd
|
||||
clean_and_release_win64.cmd
|
||||
```
|
||||
|
||||
### (可选)编译成安装包|(Optional) Compile to an installation package
|
||||
## 可能出现的问题 | Troubleshooting
|
||||
|
||||
以下命令一般不需要执行,但打包时可能会用到:
|
||||
|
||||
```sh
|
||||
npm install @electron-forge/cli -g
|
||||
npx electron-forge import
|
||||
```
|
||||
npm run make
|
||||
```
|
||||
|
||||
如果任务执行到`npm install electron-squirrel-startup`的步骤时卡死,请参考下面的换源教程:[https://blog.csdn.net/qq_38463737/article/details/140277803](https://blog.csdn.net/qq_38463737/article/details/140277803)。
|
||||
|
||||
Windows端如果在运行`npm run package`的时候提示`node-gyp`相关的错误,可以安装`electron-rebuild`并重新编译相关模块:
|
||||
|
||||
```sh
|
||||
npm install --save-dev electron-rebuild
|
||||
npx electron-rebuild
|
||||
```
|
||||
|
||||
然后再次运行`npm run package`。
|
||||
|
||||
-----
|
||||
|
||||
The following commands are generally not required, but may be used during packaging:
|
||||
|
||||
```sh
|
||||
npm install @electron-forge/cli -g
|
||||
npx electron-forge import
|
||||
```
|
||||
|
||||
If the task is stuck at the `npm install electron-squirrel-startup` step, please refer to the following tutorial on changing the source: [https://blog.csdn.net/qq_38463737/article/details/140277803](https://blog.csdn.net/qq_38463737/article/details/140277803).
|
||||
|
||||
If you encounter `node-gyp` related errors when running `npm run package` on Windows, you can install `electron-rebuild` and recompile the relevant modules:
|
||||
|
||||
```sh
|
||||
npm install --save-dev electron-rebuild
|
||||
npx electron-rebuild
|
||||
```
|
||||
|
||||
Then run `npm run package` again.
|
||||
|
||||
|
@@ -30,7 +30,7 @@ def update_file_version(file_path, new_version, key="当前版本/Current Versio
|
||||
file.write(line)
|
||||
|
||||
|
||||
version = "0.6.0"
|
||||
version = "0.6.3"
|
||||
|
||||
# py html js
|
||||
|
||||
@@ -47,7 +47,8 @@ if __name__ == "__main__":
|
||||
|
||||
# index.html
|
||||
file_path = "./src/index.html"
|
||||
update_file_version(file_path, version, key="当前版本/Current Version: <b>v")
|
||||
update_file_version(file_path, version, key="软件当前版本:<b>v")
|
||||
update_file_version(file_path, version, key="Current Version: <b>v")
|
||||
|
||||
# package.json
|
||||
file_path = "./package.json"
|
||||
|
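The hunk above replaces one call using a combined bilingual key with two calls, one per language. The body of `update_file_version` is not shown in this diff, so the following is only a sketch of how a key-based version replacement could work. `replace_version_after_key` is a hypothetical name, and it operates on a string rather than a file.

```python
import re


def replace_version_after_key(text, new_version, key):
    """Replace the dotted version number that directly follows `key`.

    Hypothetical illustration; text without the key is returned unchanged.
    """
    pattern = re.compile(re.escape(key) + r"\d+(?:\.\d+)*")
    # A function replacement avoids backslash and group escapes in the key.
    return pattern.sub(lambda match: key + new_version, text)
```

Under this sketch, a key that no longer occurs in the file (such as the old combined key after the page text was split) would silently change nothing, which is one reason to keep the keys in step with `index.html`.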
@@ -11,9 +11,10 @@ del out\EasySpider\resources\app\vs_BuildTools.exe
|
||||
move out\EasySpider ..\.temp_to_pub\EasySpider_windows_x32\EasySpider
|
||||
rmdir /s /q ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||
mkdir ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||
copy ..\ExecuteStage\easyspider_executestage.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||
copy ..\ExecuteStage\myChrome.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||
copy ..\ExecuteStage\utils.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||
@REM copy ..\ExecuteStage\easyspider_executestage.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||
@REM copy ..\ExecuteStage\myChrome.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||
@REM copy ..\ExecuteStage\utils.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||
copy ..\ExecuteStage\*.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||
copy ..\ExecuteStage\requirements.txt ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||
copy ..\ExecuteStage\Readme.md ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||
copy ..\ExecuteStage\myCode.py ..\.temp_to_pub\EasySpider_windows_x32
|
||||
|
@@ -11,9 +11,10 @@ del out\EasySpider\resources\app\vs_BuildTools.exe
|
||||
move out\EasySpider ..\.temp_to_pub\EasySpider_windows_x64\EasySpider
|
||||
rmdir /s /Q ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||
mkdir ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||
copy ..\ExecuteStage\easyspider_executestage.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||
copy ..\ExecuteStage\myChrome.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||
copy ..\ExecuteStage\utils.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||
@REM copy ..\ExecuteStage\easyspider_executestage.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||
@REM copy ..\ExecuteStage\myChrome.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||
@REM copy ..\ExecuteStage\utils.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||
copy ..\ExecuteStage\*.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||
copy ..\ExecuteStage\requirements.txt ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||
copy ..\ExecuteStage\Readme.md ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||
copy ..\ExecuteStage\myCode.py ..\.temp_to_pub\EasySpider_windows_x64
|
||||
|
@@ -1 +1 @@
|
||||
{"webserver_address":"http://localhost","webserver_port":8074,"user_data_folder":"./user_data","debug":false,"copyright":1,"sys_version":"x64","mysql_config_path":"./mysql_config.json","absolute_user_data_folder":"/Users/naibo/Documents/EasySpider/ElectronJS/user_data"}
|
||||
{"webserver_address":"http://localhost","webserver_port":8074,"user_data_folder":"./user_data","debug":false,"copyright":1,"sys_version":"x64","mysql_config_path":"./mysql_config.json","absolute_user_data_folder":"D:\\Documents\\Projects\\EasySpider\\ElectronJS\\user_data","lang":"zh"}
|
@@ -50,7 +50,9 @@ if (config.debug) {
|
||||
}
|
||||
let allWindowSockets = [];
|
||||
let allWindowScoketNames = [];
|
||||
task_server.start(config.webserver_port); //start local server
|
||||
if(config.webserver_address.includes("localhost") || config.webserver_address.includes("127.0.0.1")) {
|
||||
task_server.start(config.webserver_port); //start local server
|
||||
}
|
||||
let server_address = `${config.webserver_address}:${config.webserver_port}`;
|
||||
const websocket_port = 8084; //Only port 8084 is supported for now; it is hard-coded because the extension hard-codes it
|
||||
console.log("server_address: " + server_address);
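Review note: the hunk above starts the local task server only when `config.webserver_address` contains `localhost` or `127.0.0.1` as a substring. A minimal sketch of a stricter check follows, assuming the address is always a full URL with a scheme; `isLocalAddress` is a hypothetical helper and is not part of `main.js`.

```javascript
// Hypothetical helper (not in the diff): compare the parsed hostname instead
// of searching for a substring, so "http://localhost.example.com" is not
// treated as a local address.
function isLocalAddress(address) {
  let hostname;
  try {
    hostname = new URL(address).hostname;
  } catch (err) {
    return false; // not a parseable absolute URL
  }
  return hostname === "localhost" || hostname === "127.0.0.1";
}
```

With a check like this, the call `task_server.start(config.webserver_port)` could be guarded by `isLocalAddress(config.webserver_address)`; whether the stricter behaviour is wanted is a design decision for the maintainers.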
|
||||
@ -84,11 +86,11 @@ console.log(process.arch);
|
||||
if (process.platform === "win32" && process.arch === "ia32") {
|
||||
driverPath = path.join(__dirname, "chrome_win32/chromedriver_win32.exe");
|
||||
chromeBinaryPath = path.join(__dirname, "chrome_win32/chrome.exe");
|
||||
execute_path = path.join(__dirname, "chrome_win32/execute.bat");
|
||||
execute_path = path.join(__dirname, "chrome_win32/execute_win32.bat");
|
||||
} else if (process.platform === "win32" && process.arch === "x64") {
|
||||
driverPath = path.join(__dirname, "chrome_win64/chromedriver_win64.exe");
|
||||
chromeBinaryPath = path.join(__dirname, "chrome_win64/chrome.exe");
|
||||
execute_path = path.join(__dirname, "chrome_win64/execute.bat");
|
||||
execute_path = path.join(__dirname, "chrome_win64/execute_win64.bat");
|
||||
} else if (process.platform === "darwin") {
|
||||
driverPath = path.join(__dirname, "chromedriver_mac64");
|
||||
chromeBinaryPath = path.join(
|
||||
@ -99,7 +101,7 @@ if (process.platform === "win32" && process.arch === "ia32") {
|
||||
} else if (process.platform === "linux") {
|
||||
driverPath = path.join(__dirname, "chrome_linux64/chromedriver_linux64");
|
||||
chromeBinaryPath = path.join(__dirname, "chrome_linux64/chrome");
|
||||
execute_path = path.join(__dirname, "chrome_linux64/execute.sh");
|
||||
execute_path = path.join(__dirname, "chrome_linux64/execute_linux64.sh");
|
||||
}
|
||||
console.log(driverPath, chromeBinaryPath, execute_path);
|
||||
let language = "en";
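Review note: the renamed launcher scripts above are selected through an `if`/`else if` chain on `process.platform` and `process.arch`. The same mapping can be sketched as a lookup table. This is an illustration only: `EXECUTE_SCRIPTS` and `resolveExecuteScript` are hypothetical names, and the macOS entry is left out because its path is not visible in these hunks.

```javascript
// Hypothetical lookup table mirroring the paths visible in the diff.
const EXECUTE_SCRIPTS = {
  "win32:ia32": "chrome_win32/execute_win32.bat",
  "win32:x64": "chrome_win64/execute_win64.bat",
  // The diff checks only the platform for Linux, not the architecture.
  "linux": "chrome_linux64/execute_linux64.sh",
};

// Returns the relative script path, or null when the platform is not listed.
function resolveExecuteScript(platform, arch) {
  return (
    EXECUTE_SCRIPTS[`${platform}:${arch}`] ??
    EXECUTE_SCRIPTS[platform] ??
    null
  );
}
```

In `main.js` the result would still be passed through `path.join(__dirname, ...)`, as the existing code does.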
|
||||
@ -112,6 +114,7 @@ let handle_pairs = {};
|
||||
let socket_window = null;
|
||||
let socket_start = null;
|
||||
let socket_flowchart = null;
|
||||
let socket_popup = null;
|
||||
let invoke_window = null;
|
||||
|
||||
// var ffi = require('ffi-napi');
|
||||
@ -148,8 +151,8 @@ function createWindow() {
|
||||
server_address +
|
||||
"/index.html?user_data_folder=" +
|
||||
config.user_data_folder +
|
||||
"&copyright=" +
|
||||
config.copyright,
|
||||
"&copyright=" + config.copyright +
|
||||
"&lang=" + config.lang,
|
||||
{extraHeaders: "pragma: no-cache\n"}
|
||||
);
|
||||
// Hide the menu bar
|
||||
@ -160,9 +163,8 @@ function createWindow() {
|
||||
app.quit();
|
||||
}
|
||||
});
|
||||
//Debug mode
|
||||
// mainWindow.webContents.openDevTools();
|
||||
// Open the DevTools.
|
||||
// mainWindow.webContents.openDevTools()
|
||||
}
|
||||
|
||||
async function findElementRecursive(driver, by, value, frames) {
|
||||
@ -243,6 +245,7 @@ async function findElementAcrossAllWindows(
|
||||
let handles = await driver.getAllWindowHandles();
|
||||
// console.log("handles", handles);
|
||||
let content_handle = current_handle;
|
||||
let old_handle = current_handle;
|
||||
let id = -1;
|
||||
try {
|
||||
id = msg.message.id;
|
||||
@ -289,12 +292,12 @@ async function findElementAcrossAllWindows(
|
||||
xpath = msg.xpath;
|
||||
}
|
||||
}
|
||||
if (xpath.indexOf("Field(") >= 0 || xpath.indexOf("eval(") >= 0) {
|
||||
if (xpath.indexOf("Field[") >= 0 || xpath.indexOf("eval(") >= 0) {
|
||||
//Notify the browser after two seconds
|
||||
await new Promise((resolve) => setTimeout(resolve, 2000));
|
||||
notify_browser(
|
||||
'检测到XPath中包含Field("")或eval(""),试运行时无法正常定位到包含此两项表达式的元素,请在任务正式运行阶段测试是否有效。',
|
||||
'Field("") or eval("") is detected in xpath, and the element containing these two expressions cannot be located normally during trial operation. Please test whether it is valid in the formal call stage.',
|
||||
'检测到XPath中包含Field[""]或eval(""),试运行时无法正常定位到包含此两项表达式的元素,请在任务正式运行阶段测试是否有效。',
|
||||
'Field[""] or eval("") is detected in xpath, and the element containing these two expressions cannot be located normally during trial operation. Please test whether it is valid in the formal call stage.',
|
||||
"warning"
|
||||
);
|
||||
return null;
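Review note: this change set switches the trial-run guard from `Field(` to `Field[` in three places (XPath, input text, and custom JavaScript), each time alongside the unchanged `eval(` check. A minimal sketch of one shared predicate follows; `containsRuntimeExpression` is a hypothetical name, not a function in the repository.

```javascript
// Hypothetical helper: true when the text contains an expression that the
// diff says cannot be resolved during a trial run.
function containsRuntimeExpression(text) {
  return text.includes("Field[") || text.includes("eval(");
}
```

Centralising the test would keep the three call sites from drifting apart the next time the expression syntax changes.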
|
||||
@ -308,7 +311,7 @@ async function findElementAcrossAllWindows(
|
||||
if (h != null && handles.includes(h)) {
|
||||
await driver.switchTo().window(h);
|
||||
current_handle = h;
|
||||
console.log("switch to handle: ", h);
|
||||
console.log("Switch to handle: ", h);
|
||||
}
|
||||
element = await findElement(driver, By.xpath, xpath, iframe);
|
||||
break;
|
||||
@ -325,6 +328,12 @@ async function findElementAcrossAllWindows(
|
||||
}
|
||||
}
|
||||
if (element == null && notifyBrowser) {
|
||||
// If the element cannot be found, switch back to the original window
|
||||
if (old_handle != null && handles.includes(old_handle)) {
|
||||
await driver.switchTo().window(old_handle);
|
||||
current_handle = old_handle;
|
||||
console.log("Switch to handle: ", old_handle);
|
||||
}
|
||||
notify_browser(
|
||||
"无法找到元素,请检查XPath是否正确:" + xpath,
|
||||
"Cannot find the element, please check if the XPath is correct: " + xpath,
|
||||
@ -651,7 +660,15 @@ async function beginInvoke(msg, ws) {
|
||||
if (parameters.xpath.includes("point(")) {
|
||||
await click_element(element, point);
|
||||
} else {
|
||||
await click_element(element);
|
||||
if (parameters.clickWay == 2){ //double click
|
||||
await click_element(element, "double");
|
||||
} else {
|
||||
if (parameters.newTab == 1){
|
||||
await click_element(element, "loopClickEvery"); //open in a new tab
|
||||
} else {
|
||||
await click_element(element); //single click
|
||||
}
|
||||
}
|
||||
}
|
||||
let alertHandleType = parameters.alertHandleType;
|
||||
if (alertHandleType == 1) {
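Review note: the new `else` branch above picks a click mode from `parameters.clickWay` and `parameters.newTab`. The decision can be sketched as a small pure function. `chooseClickType` is a hypothetical name; it covers only the non-`point(` branch and uses loose equality as the diff does.

```javascript
// Hypothetical helper mirroring the nested if/else added in this hunk.
function chooseClickType(parameters) {
  if (parameters.clickWay == 2) {
    return "double"; // double click takes priority over newTab
  }
  if (parameters.newTab == 1) {
    return "loopClickEvery"; // modifier-click that opens a new tab
  }
  return "click"; // plain single click (click_element's default type)
}
```

The caller would then reduce to `await click_element(element, chooseClickType(parameters));`.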
|
||||
@ -757,12 +774,12 @@ async function beginInvoke(msg, ws) {
|
||||
keyInfo = keyInfo.replace(match[0], jsReplacedText.toString());
|
||||
}
|
||||
}
|
||||
if (keyInfo.indexOf("Field(") >= 0 || keyInfo.indexOf("eval(") >= 0) {
|
||||
if (keyInfo.indexOf("Field[") >= 0 || keyInfo.indexOf("eval(") >= 0) {
|
||||
//Notify the browser after two seconds
|
||||
await new Promise((resolve) => setTimeout(resolve, 2000));
|
||||
notify_browser(
|
||||
'检测到文字中包含Field("")或eval(""),试运行时无法输入两项表达式的替换值,请在任务正式运行阶段测试是否有效。',
|
||||
'Field("") or eval("") is detected in the text, and the replacement value of the two expressions cannot be entered during trial operation. Please test whether it is valid in the formal call stage.',
|
||||
'检测到文字中包含Field[""]或eval(""),试运行时无法输入两项表达式的替换值,请在任务正式运行阶段测试是否有效。',
|
||||
'Field[""] or eval("") is detected in the text, and the replacement value of the two expressions cannot be entered during trial operation. Please test whether it is valid in the formal call stage.',
|
||||
"warning"
|
||||
);
|
||||
}
|
||||
@ -787,7 +804,41 @@ async function beginInvoke(msg, ws) {
|
||||
let waitTime = parameters.waitTime;
|
||||
let element = await driver.findElement(By.tagName("body"));
|
||||
if (codeMode == 0) {
|
||||
await execute_js(code, element, waitTime);
|
||||
let result = await execute_js(code, element, waitTime);
|
||||
let level = "success";
|
||||
if (result == -1) {
|
||||
level = "info";
|
||||
}
|
||||
if (result != null) {
|
||||
notify_browser(
|
||||
"JavaScript操作返回结果:" + result,
|
||||
"JavaScript operation returns result: " + result,
|
||||
level
|
||||
);
|
||||
}
|
||||
} else if (codeMode == 2) { // JS code inside a loop
|
||||
let parent_node = JSON.parse(msg.message.parentNode);
|
||||
let parent_xpath = parent_node.parameters.xpath;
|
||||
if (parent_node.parameters.loopType == 2) {
|
||||
parent_xpath = parent_node.parameters.pathList
|
||||
.split("\n")[0]
|
||||
.trim();
|
||||
}
|
||||
let elementInfo = {iframe: parameters.iframe, xpath: parent_xpath, id: -1};
|
||||
let element = await findElementAcrossAllWindows(
|
||||
elementInfo, notifyBrowser = false); //use this function to find the element and switch to its window
|
||||
let result = await execute_js(code, element, waitTime);
|
||||
let level = "success";
|
||||
if (result == -1) {
|
||||
level = "info";
|
||||
}
|
||||
if (result != null) {
|
||||
notify_browser(
|
||||
"JavaScript操作返回结果:" + result,
|
||||
"JavaScript operation returns result: " + result,
|
||||
level
|
||||
);
|
||||
}
|
||||
} else if (codeMode == 8) {
|
||||
//Refresh the page
|
||||
try {
|
||||
@ -859,11 +910,11 @@ async function beginInvoke(msg, ws) {
|
||||
execute_js(afterJS, element, afterJSWaitTime);
|
||||
} else if (option == 11) {
|
||||
//A single data-extraction parameter
|
||||
notify_browser(
|
||||
"提示:提取数据操作只能试运行设置的JavaScript语句,且只针对第一个匹配的元素。",
|
||||
"Hint: can only test JavaScript statement set in the data extraction operation, and only for the first matching element.",
|
||||
"info"
|
||||
);
|
||||
// notify_browser(
|
||||
// "提示:提取数据字段的试运行操作只针对第一个匹配的元素。",
|
||||
// "Hint: can only test the trial operation of the data extraction field for the first matching element.",
|
||||
// "info"
|
||||
// );
|
||||
let params = parameters.params; //all data-extraction parameters
|
||||
let i = parameters.index;
|
||||
let param = params[i];
|
||||
@ -879,12 +930,111 @@ async function beginInvoke(msg, ws) {
|
||||
xpath = parent_xpath + xpath;
|
||||
}
|
||||
let elementInfo = {iframe: param.iframe, xpath: xpath, id: -1};
|
||||
let element = await findElementAcrossAllWindows(
|
||||
elementInfo,
|
||||
(notifyBrowser = false)
|
||||
);
|
||||
let element = await findElementAcrossAllWindows(elementInfo);
|
||||
if (element != null) {
|
||||
await execute_js(param.beforeJS, element, param.beforeJSWaitTime);
|
||||
if (param.contentType == 0) {
|
||||
let result = await element.getText(); // Get the text content of the element and its children
|
||||
if (param.nodeType == 2) { //link URL
|
||||
result = await element.getAttribute("href");
|
||||
notify_browser("获取的链接地址:" + result, "Link URL obtained: " + result, "success")
|
||||
} else if (param.nodeType == 3) { //form value
|
||||
result = await element.getAttribute("value");
|
||||
notify_browser("获取的表单值:" + result, "Form value obtained: " + result, "success")
|
||||
} else if (param.nodeType == 4) { //image URL
|
||||
result = await element.getAttribute("src");
|
||||
notify_browser("获取的图片地址:" + result, "Image URL obtained: " + result, "success")
|
||||
} else {
|
||||
notify_browser("获取的文本内容:" + result, "Text content obtained: " + result, "success");
|
||||
}
|
||||
} else if (param.contentType == 1) {
|
||||
// With Selenium, getting text that excludes child elements may need special handling; element is assumed to be the parent here
|
||||
let command = 'var arr = [];\
|
||||
var content = arguments[0];\
|
||||
for(var i = 0, len = content.childNodes.length; i < len; i++) {\
|
||||
if(content.childNodes[i].nodeType === 3){ \
|
||||
arr.push(content.childNodes[i].nodeValue);\
|
||||
}\
|
||||
}\
|
||||
var str = arr.join(" "); \
|
||||
return str;'
|
||||
let result = await execute_js(command, element, 0);
|
||||
result = result.replace(/\n/g, "").replace(/\s+/g, " ");
|
||||
notify_browser("获取的内容:" + result, "Content obtained: " + result, "success");
|
||||
} else if (param.contentType == 2) {
|
||||
let result = await element.getAttribute('innerHTML'); // Get the element's inner HTML
|
||||
notify_browser("获取的innerHTML:" + result, "innerHTML obtained: " + result, "success");
|
||||
} else if (param.contentType == 3) {
|
||||
let result = await element.getAttribute('outerHTML'); // Get the HTML of the element together with its content
|
||||
notify_browser("获取的outerHTML:" + result, "outerHTML obtained: " + result, "success");
|
||||
} else if (param.contentType == 4) {
|
||||
let result = await element.getCssValue('background-image'); // Get the element's background image URL
|
||||
notify_browser("获取的背景图片地址:" + result, "Background image URL obtained: " + result, "success");
|
||||
} else if (param.contentType == 5) {
|
||||
let result = await driver.getCurrentUrl(); // Get the page URL
|
||||
notify_browser("获取的页面网址:" + result, "Page URL obtained: " + result, "success");
|
||||
} else if (param.contentType == 6) { //page title
|
||||
let result = await driver.getTitle();
|
||||
notify_browser("获取的页面标题:" + result, "Page title obtained: " + result, "success");
|
||||
} else if (param.contentType == 9) { //return value of JavaScript code run against the element
|
||||
let result = await execute_js(param.JS, element);
|
||||
let level = "success";
|
||||
if (result == -1) {
|
||||
level = "info";
|
||||
}
|
||||
if (result != null) {
|
||||
notify_browser(
|
||||
"JavaScript操作返回结果:" + result,
|
||||
"JavaScript operation returns result: " + result,
|
||||
level
|
||||
);
|
||||
}
|
||||
} else if (param.contentType == 10) {
|
||||
// Value of the option currently selected in the select box
|
||||
let result = await element.getAttribute("value");
|
||||
notify_browser(
|
||||
"获取的选项值:" + result,
|
||||
"Option value obtained: " + result,
|
||||
"success"
|
||||
);
|
||||
} else if (param.contentType == 11) {
|
||||
// Text of the option currently selected in the select box
|
||||
let selectElement = new Select(element);
|
||||
// Wait for the select element to become enabled; this step is optional, depending on how the page loads
|
||||
await driver.wait(until.elementIsEnabled(element));
|
||||
// Get the currently selected option element
|
||||
let selectedOption = await selectElement.getFirstSelectedOption();
|
||||
// Get the option's text content
|
||||
let content = await selectedOption.getText();
|
||||
notify_browser(
|
||||
"获取的选项文本:" + content,
|
||||
"Option text obtained: " + content,
|
||||
"success"
|
||||
);
|
||||
} else if (param.contentType == 14) {
|
||||
//Attribute value of the element
|
||||
let result = await element.getAttribute(param.JS);
|
||||
notify_browser(
|
||||
"获取的属性值:" + result,
|
||||
"Attribute value obtained: " + result,
|
||||
"success"
|
||||
);
|
||||
} else if(param.contentType == 15) {
|
||||
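//Constant value (the original comment, copied from the branch above, read "attribute value of the element")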
|
||||
let result = param.JS;
|
||||
notify_browser(
|
||||
"获取的常量值:" + result,
|
||||
"Constant value obtained: " + result,
|
||||
"success"
|
||||
);
|
||||
} else {
|
||||
//Other types are not supported yet
|
||||
notify_browser(
|
||||
"暂不支持测试此类型的数据提取,请在任务正式运行阶段测试是否有效。",
|
||||
"This type of data extraction is not supported for testing. Please test whether it is valid in the formal call stage.",
|
||||
"warning"
|
||||
);
|
||||
}
|
||||
await execute_js(param.afterJS, element, param.afterJSWaitTime);
|
||||
}
|
||||
}
|
||||
@ -982,18 +1132,41 @@ async function beginInvoke(msg, ws) {
|
||||
} catch {
|
||||
console.log("Cannot get Cookies");
|
||||
}
|
||||
} else if (msg.type == 30) {
|
||||
send_message_to_browser(
|
||||
JSON.stringify({
|
||||
type: "showAllToolboxes"
|
||||
})
|
||||
);
|
||||
console.log("Show all toolboxes");
|
||||
} else if (msg.type == 31) {
|
||||
send_message_to_browser(
|
||||
JSON.stringify({
|
||||
type: "hideAllToolboxes"
|
||||
})
|
||||
);
|
||||
console.log("Hide all toolboxes");
|
||||
}
|
||||
}
|
||||
|
||||
async function click_element(element, type = "click") {
|
||||
try {
|
||||
if (type == "loopClickEvery") {
|
||||
await driver
|
||||
if (process.platform === "darwin") {
|
||||
await driver
|
||||
.actions()
|
||||
.keyDown(Key.COMMAND)
|
||||
.click(element)
|
||||
.keyUp(Key.COMMAND)
|
||||
.perform();
|
||||
} else {
|
||||
await driver
|
||||
.actions()
|
||||
.keyDown(Key.CONTROL)
|
||||
.click(element)
|
||||
.keyUp(Key.CONTROL)
|
||||
.perform();
|
||||
}
|
||||
} else if (type.includes("point(")) {
|
||||
//point(10, 20) means clicking at coordinates (10, 20)
|
||||
let point = type.substring(6, type.length - 1).split(",");
|
||||
@ -1005,6 +1178,8 @@ async function click_element(element, type = "click") {
|
||||
// await actions.click().perform();
|
||||
let script = `document.elementFromPoint(${x}, ${y}).click();`;
|
||||
await driver.executeScript(script);
|
||||
} else if (type == "double") {
|
||||
await driver.actions().doubleClick(element).perform();
|
||||
} else {
|
||||
await element.click();
|
||||
}
|
||||
@ -1038,12 +1213,12 @@ async function execute_js(js, element, wait_time = 3) {
|
||||
);
|
||||
outcome = -1;
|
||||
}
|
||||
if (js.indexOf("Field(") >= 0 || js.indexOf("eval(") >= 0) {
|
||||
if (js.indexOf("Field[") >= 0 || js.indexOf("eval(") >= 0) {
|
||||
//Notify the browser after two seconds
|
||||
await new Promise((resolve) => setTimeout(resolve, 2000));
|
||||
notify_browser(
|
||||
'检测到JavaScript中包含Field("")或eval(""),试运行时无法执行两项表达式,请在任务正式运行阶段测试是否有效。',
|
||||
'Field("") or eval("") is detected in JavaScript, and the two expressions cannot be executed during trial operation. Please test whether it is valid in the formal call stage.',
|
||||
'检测到JavaScript中包含Field[""]或eval(""),试运行时无法执行两项表达式,请在任务正式运行阶段测试是否有效。',
|
||||
'Field[""] or eval("") is detected in JavaScript, and the two expressions cannot be executed during trial operation. Please test whether it is valid in the formal call stage.',
|
||||
"warning"
|
||||
);
|
||||
}
|
||||
@ -1063,6 +1238,9 @@ function notify_flowchart(msg_zh, msg_en, level = "info") {
|
||||
}
|
||||
|
||||
function notify_browser(msg_zh, msg_en, level = "info") {
|
||||
if (msg_zh.split(":").length > 1 && msg_zh.split(":")[1].includes("null")) {
|
||||
level = "warning";
|
||||
}
|
||||
send_message_to_browser(
|
||||
JSON.stringify({
|
||||
type: "notify",
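Review note: the three lines added to `notify_browser` downgrade a notification to `"warning"` when the segment after the first `:` of the Chinese message contains `null`, which catches the "obtained: null" results from the extraction branches. A minimal sketch of that rule as a standalone function follows; `resolveLevel` is a hypothetical name.

```javascript
// Hypothetical helper reproducing the level override added in this hunk.
// Only the segment between the first and second ":" is inspected, exactly
// as msg_zh.split(":")[1] does in the diff.
function resolveLevel(msgZh, level = "info") {
  const parts = msgZh.split(":");
  if (parts.length > 1 && parts[1].includes("null")) {
    return "warning";
  }
  return level;
}
```

Note that any value that merely contains the substring `null` (for example `nullable`) also triggers the warning; whether that is acceptable depends on the data being extracted.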
|
||||
@ -1111,6 +1289,9 @@ wss.on("connection", function (ws) {
|
||||
// console.log("socket_flowchart closed");
|
||||
// });
|
||||
console.log("set socket_flowchart at time: ", new Date());
|
||||
} else if (msg.message.id == 3) {
|
||||
socket_popup = ws;
|
||||
console.log("set socket_popup at time: ", new Date());
|
||||
} else {
|
||||
//Other IDs identify the individual browser tabs
|
||||
// await new Promise(resolve => setTimeout(resolve, 200));
|
||||
@ -1213,6 +1394,8 @@ async function runBrowser(lang = "en", user_data_folder = "", mobile = false) {
|
||||
let options = new chrome.Options();
|
||||
options.addArguments("--disable-blink-features=AutomationControlled");
|
||||
options.addArguments("--disable-infobars");
|
||||
options.addArguments("--disable-web-security");
|
||||
options.addArguments("--disable-features=CrossSiteDocumentBlockingIfIsolating,CrossSiteDocumentBlockingAlways,IsolateOrigins,site-per-process");
|
||||
// Add an experimental option to exclude the 'enable-automation' switch
|
||||
options.set("excludeSwitches", ["enable-automation"]);
|
||||
options.excludeSwitches("enable-automation");
|
||||
@ -1399,6 +1582,17 @@ app.whenReady().then(() => {
|
||||
path.join(task_server.getDir(), "config.json"),
|
||||
JSON.stringify(config)
|
||||
);
|
||||
//Re-read the configuration file
|
||||
config = JSON.parse(fs.readFileSync(path.join(task_server.getDir(), "config.json")));
|
||||
});
|
||||
ipcMain.on("change-lang", function (event, arg) {
|
||||
config.lang = arg;
|
||||
fs.writeFileSync(
|
||||
path.join(task_server.getDir(), "config.json"),
|
||||
JSON.stringify(config)
|
||||
);
|
||||
//Re-read the configuration file
|
||||
config = JSON.parse(fs.readFileSync(path.join(task_server.getDir(), "config.json")));
|
||||
});
|
||||
createWindow();
|
||||
|
||||
|
348
ElectronJS/package-lock.json
generated
@ -1,24 +1,24 @@
|
||||
{
|
||||
"name": "easy-spider",
|
||||
"version": "0.6.0",
|
||||
"version": "0.6.3",
|
||||
"lockfileVersion": 3,
|
||||
"requires": true,
|
||||
"packages": {
|
||||
"": {
|
||||
"name": "easy-spider",
|
||||
"version": "0.6.0",
|
||||
"version": "0.6.3",
|
||||
"license": "AGPL-3.0",
|
||||
"dependencies": {
|
||||
"cors": "^2.8.5",
|
||||
"electron-squirrel-startup": "^1.0.0",
|
||||
"express": "^4.18.2",
|
||||
"express": "^4.21.2",
|
||||
"formidable": "^3.5.0",
|
||||
"http": "^0.0.1-security",
|
||||
"multer": "^1.4.5-lts.1",
|
||||
"node-abi": "^3.52.0",
|
||||
"node-window-manager": "^2.2.4",
|
||||
"selenium-webdriver": "^4.16.0",
|
||||
"ws": "^8.12.0",
|
||||
"selenium-webdriver": "^4.27.0",
|
||||
"ws": "^8.18.0",
|
||||
"xlsx": "^0.18.5"
|
||||
},
|
||||
"devDependencies": {
|
||||
@ -30,6 +30,11 @@
|
||||
"electron": "^27.1.3"
|
||||
}
|
||||
},
|
||||
"node_modules/@bazel/runfiles": {
|
||||
"version": "6.3.1",
|
||||
"resolved": "https://registry.npmjs.org/@bazel/runfiles/-/runfiles-6.3.1.tgz",
|
||||
"integrity": "sha512-1uLNT5NZsUVIGS4syuHwTzZ8HycMPyr6POA3FCE4GbMtc4rhoJk8aZKtNIRthJYfL+iioppi+rTfH3olMPr9nA=="
|
||||
},
|
||||
"node_modules/@electron-forge/cli": {
|
||||
"version": "6.2.1",
|
||||
"dev": true,
|
||||
@ -1203,6 +1208,7 @@
|
||||
},
|
||||
"node_modules/balanced-match": {
|
||||
"version": "1.0.2",
|
||||
"dev": true,
|
||||
"license": "MIT"
|
||||
},
|
||||
"node_modules/base64-js": {
|
||||
@ -1253,20 +1259,20 @@
|
||||
"license": "MIT"
|
||||
},
|
||||
"node_modules/body-parser": {
|
||||
"version": "1.20.1",
|
||||
"resolved": "https://registry.npmjs.org/body-parser/-/body-parser-1.20.1.tgz",
|
||||
"integrity": "sha512-jWi7abTbYwajOytWCQc37VulmWiRae5RyTpaCyDcS5/lMdtwSz5lOpDE67srw/HYe35f1z3fDQw+3txg7gNtWw==",
|
||||
"version": "1.20.3",
|
||||
"resolved": "https://registry.npmjs.org/body-parser/-/body-parser-1.20.3.tgz",
|
||||
"integrity": "sha512-7rAxByjUMqQ3/bHJy7D6OGXvx/MMc4IqBn/X0fcM1QUcAItpZrBEYhWGem+tzXH90c+G01ypMcYJBO9Y30203g==",
|
||||
"dependencies": {
|
||||
"bytes": "3.1.2",
|
||||
"content-type": "~1.0.4",
|
||||
"content-type": "~1.0.5",
|
||||
"debug": "2.6.9",
|
||||
"depd": "2.0.0",
|
||||
"destroy": "1.2.0",
|
||||
"http-errors": "2.0.0",
|
||||
"iconv-lite": "0.4.24",
|
||||
"on-finished": "2.4.1",
|
||||
"qs": "6.11.0",
|
||||
"raw-body": "2.5.1",
|
||||
"qs": "6.13.0",
|
||||
"raw-body": "2.5.2",
|
||||
"type-is": "~1.6.18",
|
||||
"unpipe": "1.0.0"
|
||||
},
|
||||
@ -1307,6 +1313,7 @@
|
||||
},
|
||||
"node_modules/brace-expansion": {
|
||||
"version": "1.1.11",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"balanced-match": "^1.0.0",
|
||||
@ -1314,11 +1321,12 @@
|
||||
}
|
||||
},
|
||||
"node_modules/braces": {
|
||||
"version": "3.0.2",
|
||||
"version": "3.0.3",
|
||||
"resolved": "https://registry.npmjs.org/braces/-/braces-3.0.3.tgz",
|
||||
"integrity": "sha512-yQbXgO/OSZVD2IsiLlro+7Hf6Q18EJrKSEsdoMzKePKXct3gvD8oLcOQdIzGupr5Fj+EDe8gO/lxc1BzfMpxvA==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"fill-range": "^7.0.1"
|
||||
"fill-range": "^7.1.1"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">=8"
|
||||
@ -1480,12 +1488,18 @@
|
||||
}
|
||||
},
|
||||
"node_modules/call-bind": {
|
||||
"version": "1.0.2",
|
||||
"resolved": "https://registry.npmjs.org/call-bind/-/call-bind-1.0.2.tgz",
|
||||
"integrity": "sha512-7O+FbCihrB5WGbFYesctwmTKae6rOiIzmz1icreWJ+0aA7LJfuqhEso2T9ncpcFtzMQtzXf2QGGueWJGTYsqrA==",
|
||||
"version": "1.0.7",
|
||||
"resolved": "https://registry.npmjs.org/call-bind/-/call-bind-1.0.7.tgz",
|
||||
"integrity": "sha512-GHTSNSYICQ7scH7sZ+M2rFopRoLh8t2bLSW6BbgrtLsahOIB5iyAVJf9GjWK3cYTDaMj4XdBpM1cA6pIS0Kv2w==",
|
||||
"dependencies": {
|
||||
"function-bind": "^1.1.1",
|
||||
"get-intrinsic": "^1.0.2"
|
||||
"es-define-property": "^1.0.0",
|
||||
"es-errors": "^1.3.0",
|
||||
"function-bind": "^1.1.2",
|
||||
"get-intrinsic": "^1.2.4",
|
||||
"set-function-length": "^1.2.1"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">= 0.4"
|
||||
},
|
||||
"funding": {
|
||||
"url": "https://github.com/sponsors/ljharb"
|
||||
@ -1661,6 +1675,7 @@
|
||||
},
|
||||
"node_modules/concat-map": {
|
||||
"version": "0.0.1",
|
||||
"dev": true,
|
||||
"license": "MIT"
|
||||
},
|
||||
"node_modules/concat-stream": {
|
||||
@ -1721,9 +1736,9 @@
|
||||
}
|
||||
},
|
||||
"node_modules/cookie": {
|
||||
"version": "0.5.0",
|
||||
"resolved": "https://registry.npmjs.org/cookie/-/cookie-0.5.0.tgz",
|
||||
"integrity": "sha512-YZ3GUyn/o8gfKJlnlX7g7xq4gyO6OSuhGPKaaGssGB2qgDUS0gPgtTvoyZLTt9Ab6dC4hfc9dV5arkvc/OCmrw==",
|
||||
"version": "0.7.1",
|
||||
"resolved": "https://registry.npmjs.org/cookie/-/cookie-0.7.1.tgz",
|
||||
"integrity": "sha512-6DnInpx7SJ2AK3+CTUE/ZM0vWTUboZCegxhC2xiIydHR9jNuTAASBrfEpHhiGOZw/nX51bHt6YQl8jsGo4y/0w==",
|
||||
"engines": {
|
||||
"node": ">= 0.6"
|
||||
}
|
||||
@ -1898,6 +1913,22 @@
|
||||
"node": ">=10"
|
||||
}
|
||||
},
|
||||
"node_modules/define-data-property": {
|
||||
"version": "1.1.4",
|
||||
"resolved": "https://registry.npmjs.org/define-data-property/-/define-data-property-1.1.4.tgz",
|
||||
"integrity": "sha512-rBMvIzlpA8v6E+SJZoo++HAYqsLrkg7MSfIinMPFhmkorw7X+dOXVJQs+QT69zGkzMyfDnIMN2Wid1+NbL3T+A==",
|
||||
"dependencies": {
|
||||
"es-define-property": "^1.0.0",
|
||||
"es-errors": "^1.3.0",
|
||||
"gopd": "^1.0.1"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">= 0.4"
|
||||
},
|
||||
"funding": {
|
||||
"url": "https://github.com/sponsors/ljharb"
|
||||
}
|
||||
},
|
||||
"node_modules/define-properties": {
|
||||
"version": "1.2.0",
|
||||
"dev": true,
|
||||
@ -2166,9 +2197,9 @@
|
||||
"license": "MIT"
|
||||
},
|
||||
"node_modules/encodeurl": {
|
||||
"version": "1.0.2",
|
||||
"resolved": "https://registry.npmjs.org/encodeurl/-/encodeurl-1.0.2.tgz",
|
||||
"integrity": "sha512-TPJXq8JqFaVYm2CWmPvnP2Iyo4ZSM7/QKcSmuMLDObfpH5fi7RUGmd/rTDf+rut/saiDiQEeVTNgAmJEdAOx0w==",
|
||||
"version": "2.0.0",
|
||||
"resolved": "https://registry.npmjs.org/encodeurl/-/encodeurl-2.0.0.tgz",
|
||||
"integrity": "sha512-Q0n9HRi4m6JuGIV1eFlmvJB7ZEVxu93IrMyiMsGC0lrMJMWzRgx6WGquyfQgZVb31vhGgXnfmPNNXmxnOkRBrg==",
|
||||
"engines": {
|
||||
"node": ">= 0.8"
|
||||
}
|
||||
@ -2211,6 +2242,25 @@
|
||||
"is-arrayish": "^0.2.1"
|
||||
}
|
||||
},
|
||||
"node_modules/es-define-property": {
|
||||
"version": "1.0.0",
|
||||
"resolved": "https://registry.npmjs.org/es-define-property/-/es-define-property-1.0.0.tgz",
|
||||
"integrity": "sha512-jxayLKShrEqqzJ0eumQbVhTYQM27CfT1T35+gCgDFoL82JLsXqTJ76zv6A0YLOgEnLUMvLzsDsGIrl8NFpT2gQ==",
|
||||
"dependencies": {
|
||||
"get-intrinsic": "^1.2.4"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">= 0.4"
|
||||
}
|
||||
},
|
||||
"node_modules/es-errors": {
|
||||
"version": "1.3.0",
|
||||
"resolved": "https://registry.npmjs.org/es-errors/-/es-errors-1.3.0.tgz",
|
||||
"integrity": "sha512-Zf5H2Kxt2xjTvbJvP2ZWLEICxA6j+hAmMzIlypy4xcBg1vKVnx89Wy0GbS+kf5cwCVFFzdCFh2XSCFNULS6csw==",
|
||||
"engines": {
|
||||
"node": ">= 0.4"
|
||||
}
|
||||
},
|
||||
"node_modules/es6-error": {
|
||||
"version": "4.1.1",
|
||||
"dev": true,
|
||||
@ -2356,36 +2406,36 @@
|
||||
"license": "Apache-2.0"
|
||||
},
|
||||
"node_modules/express": {
|
||||
"version": "4.18.2",
|
||||
"resolved": "https://registry.npmjs.org/express/-/express-4.18.2.tgz",
|
||||
"integrity": "sha512-5/PsL6iGPdfQ/lKM1UuielYgv3BUoJfz1aUwU9vHZ+J7gyvwdQXFEBIEIaxeGf0GIcreATNyBExtalisDbuMqQ==",
|
||||
"version": "4.21.2",
|
||||
"resolved": "https://registry.npmjs.org/express/-/express-4.21.2.tgz",
|
||||
"integrity": "sha512-28HqgMZAmih1Czt9ny7qr6ek2qddF4FclbMzwhCREB6OFfH+rXAnuNCwo1/wFvrtbgsQDb4kSbX9de9lFbrXnA==",
|
||||
"dependencies": {
|
||||
"accepts": "~1.3.8",
|
||||
"array-flatten": "1.1.1",
|
||||
"body-parser": "1.20.1",
|
||||
"body-parser": "1.20.3",
|
||||
"content-disposition": "0.5.4",
|
||||
"content-type": "~1.0.4",
|
||||
"cookie": "0.5.0",
|
||||
"cookie": "0.7.1",
|
||||
"cookie-signature": "1.0.6",
|
||||
"debug": "2.6.9",
|
||||
"depd": "2.0.0",
|
||||
"encodeurl": "~1.0.2",
|
||||
"encodeurl": "~2.0.0",
|
||||
"escape-html": "~1.0.3",
|
||||
"etag": "~1.8.1",
|
||||
"finalhandler": "1.2.0",
|
||||
"finalhandler": "1.3.1",
|
||||
"fresh": "0.5.2",
|
||||
"http-errors": "2.0.0",
|
||||
"merge-descriptors": "1.0.1",
|
||||
"merge-descriptors": "1.0.3",
|
||||
"methods": "~1.1.2",
|
||||
"on-finished": "2.4.1",
|
||||
"parseurl": "~1.3.3",
|
||||
"path-to-regexp": "0.1.7",
|
||||
"path-to-regexp": "0.1.12",
|
||||
"proxy-addr": "~2.0.7",
|
||||
"qs": "6.11.0",
|
||||
"qs": "6.13.0",
|
||||
"range-parser": "~1.2.1",
|
||||
"safe-buffer": "5.2.1",
|
||||
"send": "0.18.0",
|
||||
"serve-static": "1.15.0",
|
||||
"send": "0.19.0",
|
||||
"serve-static": "1.16.2",
|
||||
"setprototypeof": "1.2.0",
|
||||
"statuses": "2.0.1",
|
||||
"type-is": "~1.6.18",
|
||||
@ -2394,6 +2444,10 @@
|
||||
},
|
||||
"engines": {
|
||||
"node": ">= 0.10.0"
|
||||
},
|
||||
"funding": {
|
||||
"type": "opencollective",
|
||||
"url": "https://opencollective.com/express"
|
||||
}
|
||||
},
|
||||
"node_modules/express/node_modules/debug": {
|
||||
@ -2515,9 +2569,10 @@
|
||||
}
|
||||
},
|
||||
"node_modules/fill-range": {
|
||||
"version": "7.0.1",
|
||||
"version": "7.1.1",
|
||||
"resolved": "https://registry.npmjs.org/fill-range/-/fill-range-7.1.1.tgz",
|
||||
"integrity": "sha512-YsGpe3WHLK8ZYi4tWDg2Jy3ebRz2rXowDxnld4bkQB00cc/1Zw9AWnC0i9ztDJitivtQvaI9KaLyKrc+hBW0yg==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"to-regex-range": "^5.0.1"
|
||||
},
|
||||
@ -2526,12 +2581,12 @@
|
||||
}
|
||||
},
|
||||
"node_modules/finalhandler": {
|
||||
"version": "1.2.0",
|
||||
"resolved": "https://registry.npmjs.org/finalhandler/-/finalhandler-1.2.0.tgz",
|
||||
"integrity": "sha512-5uXcUVftlQMFnWC9qu/svkWv3GTd2PfUhK/3PLkYNAe7FbqJMt3515HaxE6eRL74GdsriiwujiawdaB1BpEISg==",
|
||||
"version": "1.3.1",
|
||||
"resolved": "https://registry.npmjs.org/finalhandler/-/finalhandler-1.3.1.tgz",
|
||||
"integrity": "sha512-6BN9trH7bp3qvnrRyzsBz+g3lZxTNZTbVO2EV1CS0WIcDbawYVdYvGflME/9QP0h0pYlCDBCTjYa9nZzMDpyxQ==",
|
||||
"dependencies": {
|
||||
"debug": "2.6.9",
|
||||
"encodeurl": "~1.0.2",
|
||||
"encodeurl": "~2.0.0",
|
||||
"escape-html": "~1.0.3",
|
||||
"on-finished": "2.4.1",
|
||||
"parseurl": "~1.3.3",
|
||||
@ -2695,11 +2750,16 @@
|
||||
},
|
||||
"node_modules/fs.realpath": {
|
||||
"version": "1.0.0",
|
||||
"dev": true,
|
||||
"license": "ISC"
|
||||
},
|
||||
"node_modules/function-bind": {
|
||||
"version": "1.1.1",
|
||||
"license": "MIT"
|
||||
"version": "1.1.2",
|
||||
"resolved": "https://registry.npmjs.org/function-bind/-/function-bind-1.1.2.tgz",
|
||||
"integrity": "sha512-7XHNxH7qX9xG5mIwxkhumTox/MIRNcOgDrxWsMt2pAr23WHp6MrRlN7FBSFpCpr+oVO0F744iUgR82nJMfG2SA==",
|
||||
"funding": {
|
||||
"url": "https://github.com/sponsors/ljharb"
|
||||
}
|
||||
},
|
||||
"node_modules/galactus": {
|
||||
"version": "0.2.1",
|
||||
@ -2780,13 +2840,18 @@
|
||||
}
|
||||
},
|
||||
"node_modules/get-intrinsic": {
|
||||
"version": "1.2.1",
|
||||
"license": "MIT",
|
||||
"version": "1.2.4",
|
||||
"resolved": "https://registry.npmjs.org/get-intrinsic/-/get-intrinsic-1.2.4.tgz",
|
||||
"integrity": "sha512-5uYhsJH8VJBTv7oslg4BznJYhDoRI6waYCxMmCdnTrcCrHA/fCFKoTFz2JKKE0HdDFUF7/oQuhzumXJK7paBRQ==",
|
||||
"dependencies": {
|
||||
"function-bind": "^1.1.1",
|
||||
"has": "^1.0.3",
|
||||
"es-errors": "^1.3.0",
|
||||
"function-bind": "^1.1.2",
|
||||
"has-proto": "^1.0.1",
|
||||
"has-symbols": "^1.0.3"
|
||||
"has-symbols": "^1.0.3",
|
||||
"hasown": "^2.0.0"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">= 0.4"
|
||||
},
|
||||
"funding": {
|
||||
"url": "https://github.com/sponsors/ljharb"
|
||||
@ -2835,6 +2900,7 @@
|
||||
},
|
||||
"node_modules/glob": {
|
||||
"version": "7.2.3",
|
||||
"dev": true,
|
||||
"license": "ISC",
|
||||
"dependencies": {
|
||||
"fs.realpath": "^1.0.0",
|
||||
@ -2933,6 +2999,17 @@
|
||||
"url": "https://github.com/sponsors/ljharb"
|
||||
}
|
||||
},
|
||||
"node_modules/gopd": {
|
||||
"version": "1.0.1",
|
||||
"resolved": "https://registry.npmjs.org/gopd/-/gopd-1.0.1.tgz",
|
||||
"integrity": "sha512-d65bNlIadxvpb/A2abVdlqKqV563juRnZ1Wtk6s1sIR8uNsXR70xqIzVqxVf1eTqDunwT2MkczEeaezCKTZhwA==",
|
||||
"dependencies": {
|
||||
"get-intrinsic": "^1.1.3"
|
||||
},
|
||||
"funding": {
|
||||
"url": "https://github.com/sponsors/ljharb"
|
||||
}
|
||||
},
|
||||
"node_modules/got": {
|
||||
"version": "11.8.6",
|
||||
"dev": true,
|
||||
@ -2964,6 +3041,7 @@
|
||||
},
|
||||
"node_modules/has": {
|
||||
"version": "1.0.3",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"function-bind": "^1.1.1"
|
||||
@ -2981,12 +3059,11 @@
|
||||
}
|
||||
},
|
||||
"node_modules/has-property-descriptors": {
|
||||
"version": "1.0.0",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"optional": true,
|
||||
"version": "1.0.2",
|
||||
"resolved": "https://registry.npmjs.org/has-property-descriptors/-/has-property-descriptors-1.0.2.tgz",
|
||||
"integrity": "sha512-55JNKuIW+vq4Ke1BjOTjM2YctQIvCT7GFzHwmfZPGo5wnrgkid0YQtnAleFSqumZm4az3n2BS+erby5ipJdgrg==",
|
||||
"dependencies": {
|
||||
"get-intrinsic": "^1.1.1"
|
||||
"es-define-property": "^1.0.0"
|
||||
},
|
||||
"funding": {
|
||||
"url": "https://github.com/sponsors/ljharb"
|
||||
@ -3017,6 +3094,17 @@
|
||||
"dev": true,
|
||||
"license": "ISC"
|
||||
},
|
||||
"node_modules/hasown": {
|
||||
"version": "2.0.2",
|
||||
"resolved": "https://registry.npmjs.org/hasown/-/hasown-2.0.2.tgz",
|
||||
"integrity": "sha512-0hJU9SCPvmMzIBdZFqNPXWa6dqh7WdH0cII9y+CyS8rG3nL48Bclra9HmKhVVUHyPWNH5Y7xDwAB7bfgSjkUMQ==",
|
||||
"dependencies": {
|
||||
"function-bind": "^1.1.2"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">= 0.4"
|
||||
}
|
||||
},
|
||||
"node_modules/hexoid": {
|
||||
"version": "1.0.0",
|
||||
"resolved": "https://registry.npmjs.org/hexoid/-/hexoid-1.0.0.tgz",
|
||||
@ -3162,6 +3250,7 @@
|
||||
},
|
||||
"node_modules/inflight": {
|
||||
"version": "1.0.6",
|
||||
"dev": true,
|
||||
"license": "ISC",
|
||||
"dependencies": {
|
||||
"once": "^1.3.0",
|
||||
@ -3186,9 +3275,10 @@
|
||||
}
|
||||
},
|
||||
"node_modules/ip": {
|
||||
"version": "2.0.0",
|
||||
"dev": true,
|
||||
"license": "MIT"
|
||||
"version": "2.0.1",
|
||||
"resolved": "https://registry.npmjs.org/ip/-/ip-2.0.1.tgz",
|
||||
"integrity": "sha512-lJUL9imLTNi1ZfXT+DU6rBBdbiKGBuay9B6xGSPVjUeQwaH1RIGqef8RZkUtHioLmSNpPR5M4HVKJGm1j8FWVQ==",
|
||||
"dev": true
|
||||
},
|
||||
"node_modules/ipaddr.js": {
|
||||
"version": "1.9.1",
|
||||
@ -3270,8 +3360,9 @@
|
||||
},
|
||||
"node_modules/is-number": {
|
||||
"version": "7.0.0",
|
||||
"resolved": "https://registry.npmjs.org/is-number/-/is-number-7.0.0.tgz",
|
||||
"integrity": "sha512-41Cifkg6e8TylSpdtTpeLVMqvSBEVzTttHvERD741+pnZ8ANv0004MRL43QKPDlK9cGvNp6NZWZUBlbGXYxxng==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"engines": {
|
||||
"node": ">=0.12.0"
|
||||
}
|
||||
@ -3640,9 +3731,12 @@
|
||||
}
|
||||
},
|
||||
"node_modules/merge-descriptors": {
|
||||
"version": "1.0.1",
|
||||
"resolved": "https://registry.npmjs.org/merge-descriptors/-/merge-descriptors-1.0.1.tgz",
|
||||
"integrity": "sha512-cCi6g3/Zr1iqQi6ySbseM1Xvooa98N0w31jzUYrXPX2xqObmFGHJ0tQ5u74H3mVh7wLouTseZyYIq39g8cNp1w=="
|
||||
"version": "1.0.3",
|
||||
"resolved": "https://registry.npmjs.org/merge-descriptors/-/merge-descriptors-1.0.3.tgz",
|
||||
"integrity": "sha512-gaNvAS7TZ897/rVaZ0nMtAyxNyi/pdbjbAwUpFQpN70GqnVfOiXpeUUMKRBmzXaSQ8DdTX4/0ms62r2K+hE6mQ==",
|
||||
"funding": {
|
||||
"url": "https://github.com/sponsors/sindresorhus"
|
||||
}
|
||||
},
|
||||
"node_modules/merge2": {
|
||||
"version": "1.4.1",
|
||||
@ -3720,6 +3814,7 @@
|
||||
},
|
||||
"node_modules/minimatch": {
|
||||
"version": "3.1.2",
|
||||
"dev": true,
|
||||
"license": "ISC",
|
||||
"dependencies": {
|
||||
"brace-expansion": "^1.1.7"
|
||||
@ -4086,9 +4181,12 @@
|
||||
}
|
||||
},
|
||||
"node_modules/object-inspect": {
|
||||
"version": "1.12.3",
|
||||
"resolved": "https://registry.npmjs.org/object-inspect/-/object-inspect-1.12.3.tgz",
|
||||
"integrity": "sha512-geUvdk7c+eizMNUDkRpW1wJwgfOiOeHbxBR/hLXK1aT6zmVSO0jsQcs7fj6MGw89jC/cjGfLcNOrtMYtGqm81g==",
|
||||
"version": "1.13.2",
|
||||
"resolved": "https://registry.npmjs.org/object-inspect/-/object-inspect-1.13.2.tgz",
|
||||
"integrity": "sha512-IRZSRuzJiynemAXPYtPe5BoI/RESNYR7TYm50MC5Mqbd3Jmw5y790sErYw3V6SryFJD64b74qQQs9wn5Bg/k3g==",
|
||||
"engines": {
|
||||
"node": ">= 0.4"
|
||||
},
|
||||
"funding": {
|
||||
"url": "https://github.com/sponsors/ljharb"
|
||||
}
|
||||
@ -4284,6 +4382,7 @@
|
||||
},
|
||||
"node_modules/path-is-absolute": {
|
||||
"version": "1.0.1",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"engines": {
|
||||
"node": ">=0.10.0"
|
||||
@ -4326,9 +4425,9 @@
|
||||
}
|
||||
},
|
||||
"node_modules/path-to-regexp": {
|
||||
"version": "0.1.7",
|
||||
"resolved": "https://registry.npmjs.org/path-to-regexp/-/path-to-regexp-0.1.7.tgz",
|
||||
"integrity": "sha512-5DFkuoqlv1uYQKxy8omFBeJPQcdoE07Kv2sferDCrAq1ohOU+MSDswDIbnx3YAM60qIOnYa53wBhXW0EbMonrQ=="
|
||||
"version": "0.1.12",
|
||||
"resolved": "https://registry.npmjs.org/path-to-regexp/-/path-to-regexp-0.1.12.tgz",
|
||||
"integrity": "sha512-RA1GjUVMnvYFxuqovrEqZoxxW5NUZqbwKtYz/Tt7nXerk0LbLblQmrsgdeOxV5SFHf0UDggjS/bSeOZwt1pmEQ=="
|
||||
},
|
||||
"node_modules/path-type": {
|
||||
"version": "2.0.0",
|
||||
@ -4498,11 +4597,11 @@
|
||||
}
|
||||
},
|
||||
"node_modules/qs": {
|
||||
"version": "6.11.0",
|
||||
"resolved": "https://registry.npmjs.org/qs/-/qs-6.11.0.tgz",
|
||||
"integrity": "sha512-MvjoMCJwEarSbUYk5O+nmoSzSutSsTwF85zcHPQ9OrlFoZOYIjaqBAJIqIXjptyD5vThxGq52Xu/MaJzRkIk4Q==",
|
||||
"version": "6.13.0",
|
||||
"resolved": "https://registry.npmjs.org/qs/-/qs-6.13.0.tgz",
|
||||
"integrity": "sha512-+38qI9SOr8tfZ4QmJNplMUxqjbe7LKvvZgWdExBOmd+egZTtjLB67Gu0HRX3u/XOq7UU2Nx6nsjvS16Z9uwfpg==",
|
||||
"dependencies": {
|
||||
"side-channel": "^1.0.4"
|
||||
"side-channel": "^1.0.6"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">=0.6"
|
||||
@ -4550,9 +4649,9 @@
|
||||
}
|
||||
},
|
||||
"node_modules/raw-body": {
|
||||
"version": "2.5.1",
|
||||
"resolved": "https://registry.npmjs.org/raw-body/-/raw-body-2.5.1.tgz",
|
||||
"integrity": "sha512-qqJBtEyVgS0ZmPGdCFPWJ3FreoqvG4MVQln/kCgF7Olq95IbOp0/BWyMwbdtn4VTvkM8Y7khCQ2Xgk/tcrCXig==",
|
||||
"version": "2.5.2",
|
||||
"resolved": "https://registry.npmjs.org/raw-body/-/raw-body-2.5.2.tgz",
|
||||
"integrity": "sha512-8zGqypfENjCIqGhgXToC8aB2r7YrBX+AQAfIPs/Mlk+BtPTztOvTS01NRW/3Eh60J+a48lt8qsCzirQ6loCVfA==",
|
||||
"dependencies": {
|
||||
"bytes": "3.1.2",
|
||||
"http-errors": "2.0.0",
|
||||
@ -4734,6 +4833,7 @@
|
||||
},
|
||||
"node_modules/rimraf": {
|
||||
"version": "3.0.2",
|
||||
"dev": true,
|
||||
"license": "ISC",
|
||||
"dependencies": {
|
||||
"glob": "^7.1.3"
|
||||
@ -4801,16 +4901,27 @@
|
||||
"license": "MIT"
|
||||
},
|
||||
"node_modules/selenium-webdriver": {
|
||||
"version": "4.16.0",
|
||||
"resolved": "https://registry.npmjs.org/selenium-webdriver/-/selenium-webdriver-4.16.0.tgz",
|
||||
"integrity": "sha512-IbqpRpfGE7JDGgXHJeWuCqT/tUqnLvZ14csSwt+S8o4nJo3RtQoE9VR4jB47tP/A8ArkYsh/THuMY6kyRP6kuA==",
|
||||
"version": "4.27.0",
|
||||
"resolved": "https://registry.npmjs.org/selenium-webdriver/-/selenium-webdriver-4.27.0.tgz",
|
||||
"integrity": "sha512-LkTJrNz5socxpPnWPODQ2bQ65eYx9JK+DQMYNihpTjMCqHwgWGYQnQTCAAche2W3ZP87alA+1zYPvgS8tHNzMQ==",
|
||||
"funding": [
|
||||
{
|
||||
"type": "github",
|
||||
"url": "https://github.com/sponsors/SeleniumHQ"
|
||||
},
|
||||
{
|
||||
"type": "opencollective",
|
||||
"url": "https://opencollective.com/selenium"
|
||||
}
|
||||
],
|
||||
"dependencies": {
|
||||
"@bazel/runfiles": "^6.3.1",
|
||||
"jszip": "^3.10.1",
|
||||
"tmp": "^0.2.1",
|
||||
"ws": ">=8.14.2"
|
||||
"tmp": "^0.2.3",
|
||||
"ws": "^8.18.0"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">= 14.20.0"
|
||||
"node": ">= 14.21.0"
|
||||
}
|
||||
},
|
||||
"node_modules/semver": {
|
||||
@ -4843,9 +4954,9 @@
|
||||
}
|
||||
},
|
||||
"node_modules/send": {
|
||||
"version": "0.18.0",
|
||||
"resolved": "https://registry.npmjs.org/send/-/send-0.18.0.tgz",
|
||||
"integrity": "sha512-qqWzuOjSFOuqPjFe4NOsMLafToQQwBSOEpS+FwEt3A2V3vKubTquT3vmLTQpFgMXp8AlFWFuP1qKaJZOtPpVXg==",
|
||||
"version": "0.19.0",
|
||||
"resolved": "https://registry.npmjs.org/send/-/send-0.19.0.tgz",
|
||||
"integrity": "sha512-dW41u5VfLXu8SJh5bwRmyYUbAoSB3c9uQh6L8h/KtsFREPWpbX1lrljJo186Jc4nmci/sGUZ9a0a0J2zgfq2hw==",
|
||||
"dependencies": {
|
||||
"debug": "2.6.9",
|
||||
"depd": "2.0.0",
|
||||
@ -4878,6 +4989,14 @@
|
||||
"resolved": "https://registry.npmjs.org/ms/-/ms-2.0.0.tgz",
|
||||
"integrity": "sha512-Tpp60P6IUJDTuOq/5Z8cdskzJujfwqfOTkrwIwj7IRISpnkJnT6SyJ4PCPnGMoFjC9ddhal5KVIYtAt97ix05A=="
|
||||
},
|
||||
"node_modules/send/node_modules/encodeurl": {
|
||||
"version": "1.0.2",
|
||||
"resolved": "https://registry.npmjs.org/encodeurl/-/encodeurl-1.0.2.tgz",
|
||||
"integrity": "sha512-TPJXq8JqFaVYm2CWmPvnP2Iyo4ZSM7/QKcSmuMLDObfpH5fi7RUGmd/rTDf+rut/saiDiQEeVTNgAmJEdAOx0w==",
|
||||
"engines": {
|
||||
"node": ">= 0.8"
|
||||
}
|
||||
},
|
||||
"node_modules/send/node_modules/ms": {
|
||||
"version": "2.1.3",
|
||||
"resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz",
|
||||
@ -4911,14 +5030,14 @@
|
||||
}
|
||||
},
|
||||
"node_modules/serve-static": {
|
||||
"version": "1.15.0",
|
||||
"resolved": "https://registry.npmjs.org/serve-static/-/serve-static-1.15.0.tgz",
|
||||
"integrity": "sha512-XGuRDNjXUijsUL0vl6nSD7cwURuzEgglbOaFuZM9g3kwDXOWVTck0jLzjPzGD+TazWbboZYu52/9/XPdUgne9g==",
|
||||
"version": "1.16.2",
|
||||
"resolved": "https://registry.npmjs.org/serve-static/-/serve-static-1.16.2.tgz",
|
||||
"integrity": "sha512-VqpjJZKadQB/PEbEwvFdO43Ax5dFBZ2UECszz8bQ7pi7wt//PWe1P6MN7eCnjsatYtBT6EuiClbjSWP2WrIoTw==",
|
||||
"dependencies": {
|
||||
"encodeurl": "~1.0.2",
|
||||
"encodeurl": "~2.0.0",
|
||||
"escape-html": "~1.0.3",
|
||||
"parseurl": "~1.3.3",
|
||||
"send": "0.18.0"
|
||||
"send": "0.19.0"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">= 0.8.0"
|
||||
@ -4929,6 +5048,22 @@
|
||||
"dev": true,
|
||||
"license": "ISC"
|
||||
},
|
||||
"node_modules/set-function-length": {
|
||||
"version": "1.2.2",
|
||||
"resolved": "https://registry.npmjs.org/set-function-length/-/set-function-length-1.2.2.tgz",
|
||||
"integrity": "sha512-pgRc4hJ4/sNjWCSS9AmnS40x3bNMDTknHgL5UaMBTMyJnU90EgWh1Rz+MC9eFu4BuN/UwZjKQuY/1v3rM7HMfg==",
|
||||
"dependencies": {
|
||||
"define-data-property": "^1.1.4",
|
||||
"es-errors": "^1.3.0",
|
||||
"function-bind": "^1.1.2",
|
||||
"get-intrinsic": "^1.2.4",
|
||||
"gopd": "^1.0.1",
|
||||
"has-property-descriptors": "^1.0.2"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">= 0.4"
|
||||
}
|
||||
},
|
||||
"node_modules/setimmediate": {
|
||||
"version": "1.0.5",
|
||||
"license": "MIT"
|
||||
@ -4958,13 +5093,17 @@
|
||||
}
|
||||
},
|
||||
"node_modules/side-channel": {
|
||||
"version": "1.0.4",
|
||||
"resolved": "https://registry.npmjs.org/side-channel/-/side-channel-1.0.4.tgz",
|
||||
"integrity": "sha512-q5XPytqFEIKHkGdiMIrY10mvLRvnQh42/+GoBlFW3b2LXLE2xxJpZFdm94we0BaoV3RwJyGqg5wS7epxTv0Zvw==",
|
||||
"version": "1.0.6",
|
||||
"resolved": "https://registry.npmjs.org/side-channel/-/side-channel-1.0.6.tgz",
|
||||
"integrity": "sha512-fDW/EZ6Q9RiO8eFG8Hj+7u/oW+XrPTIChwCOM2+th2A6OblDtYYIpve9m+KvI9Z4C9qSEXlaGR6bTEYHReuglA==",
|
||||
"dependencies": {
|
||||
"call-bind": "^1.0.0",
|
||||
"get-intrinsic": "^1.0.2",
|
||||
"object-inspect": "^1.9.0"
|
||||
"call-bind": "^1.0.7",
|
||||
"es-errors": "^1.3.0",
|
||||
"get-intrinsic": "^1.2.4",
|
||||
"object-inspect": "^1.13.1"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">= 0.4"
|
||||
},
|
||||
"funding": {
|
||||
"url": "https://github.com/sponsors/ljharb"
|
||||
@ -5321,13 +5460,11 @@
|
||||
"license": "MIT"
|
||||
},
|
||||
"node_modules/tmp": {
|
||||
"version": "0.2.1",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"rimraf": "^3.0.0"
|
||||
},
|
||||
"version": "0.2.3",
|
||||
"resolved": "https://registry.npmjs.org/tmp/-/tmp-0.2.3.tgz",
|
||||
"integrity": "sha512-nZD7m9iCPC5g0pYmcaxogYKggSfLsdxl8of3Q/oIbqCqLLIO9IAF0GWjX1z9NZRHPiXv8Wex4yDCaZsgEw0Y8w==",
|
||||
"engines": {
|
||||
"node": ">=8.17.0"
|
||||
"node": ">=14.14"
|
||||
}
|
||||
},
|
||||
"node_modules/tmp-promise": {
|
||||
@ -5341,8 +5478,9 @@
|
||||
},
|
||||
"node_modules/to-regex-range": {
|
||||
"version": "5.0.1",
|
||||
"resolved": "https://registry.npmjs.org/to-regex-range/-/to-regex-range-5.0.1.tgz",
|
||||
"integrity": "sha512-65P7iz6X5yEr1cwcgvQxbbIw7Uk3gOy5dIdtZ4rDveLqhrdJP+Li/Hx6tyK0NEb+2GCyneCMJiGqrADCSNk8sQ==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"is-number": "^7.0.0"
|
||||
},
|
||||
@ -5600,9 +5738,9 @@
|
||||
"license": "ISC"
|
||||
},
|
||||
"node_modules/ws": {
|
||||
"version": "8.14.2",
|
||||
"resolved": "https://registry.npmjs.org/ws/-/ws-8.14.2.tgz",
|
||||
"integrity": "sha512-wEBG1ftX4jcglPxgFCMJmZ2PLtSbJ2Peg6TmpJFTbe9GZYOQCDPdMYu/Tm0/bGZkw8paZnJY45J4K2PZrLYq8g==",
|
||||
"version": "8.18.0",
|
||||
"resolved": "https://registry.npmjs.org/ws/-/ws-8.18.0.tgz",
|
||||
"integrity": "sha512-8VbfWfHLbbwu3+N6OKsOMpBdT4kXPDDB9cJk2bJ6mh9ucxdlnNvH1e+roYkKmN9Nxw2yjz7VzeO9oOz2zJ04Pw==",
|
||||
"engines": {
|
||||
"node": ">=10.0.0"
|
||||
},
|
||||
|
@ -1,7 +1,7 @@
|
||||
{
|
||||
"name": "easy-spider",
|
||||
"productName": "EasySpider",
|
||||
"version": "0.6.0",
|
||||
"version": "0.6.3",
|
||||
"icon": "./favicon",
|
||||
"description": "NoCode Visual Web Crawler",
|
||||
"main": "main.js",
|
||||
@ -33,14 +33,14 @@
|
||||
"dependencies": {
|
||||
"cors": "^2.8.5",
|
||||
"electron-squirrel-startup": "^1.0.0",
|
||||
"express": "^4.18.2",
|
||||
"express": "^4.21.2",
|
||||
"formidable": "^3.5.0",
|
||||
"http": "^0.0.1-security",
|
||||
"multer": "^1.4.5-lts.1",
|
||||
"node-abi": "^3.52.0",
|
||||
"node-window-manager": "^2.2.4",
|
||||
"selenium-webdriver": "^4.16.0",
|
||||
"ws": "^8.12.0",
|
||||
"selenium-webdriver": "^4.27.0",
|
||||
"ws": "^8.18.0",
|
||||
"xlsx": "^0.18.5"
|
||||
},
|
||||
"config": {
|
||||
@ -67,7 +67,7 @@
|
||||
],
|
||||
"packagerConfig": {
|
||||
"icon": "./favicon",
|
||||
"appVersion": "0.6.0",
|
||||
"appVersion": "0.6.3",
|
||||
"name": "EasySpider",
|
||||
"executableName": "EasySpider",
|
||||
"appCopyright": "Naibo Wang (naibowang@foxmail.com)",
|
||||
@ -80,4 +80,4 @@
|
||||
"publishers": []
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
@ -20,9 +20,10 @@ rm out/EasySpider/resources/app/vs_BuildTools.exe
|
||||
mv out/EasySpider ../.temp_to_pub/EasySpider_Linux_x64/EasySpider
|
||||
rm -rf ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||
mkdir ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||
cp ../ExecuteStage/easyspider_executestage.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||
cp ../ExecuteStage/myChrome.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||
cp ../ExecuteStage/utils.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||
# cp ../ExecuteStage/easyspider_executestage.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||
# cp ../ExecuteStage/myChrome.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||
# cp ../ExecuteStage/utils.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||
cp ../ExecuteStage/*.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||
cp ../ExecuteStage/requirements.txt ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||
cp ../ExecuteStage/Readme.md ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||
cp ../ExecuteStage/myCode.py ../.temp_to_pub/EasySpider_Linux_x64
|
||||
|
@ -20,9 +20,10 @@ rm -r ../.temp_to_pub/EasySpider_MacOS/EasySpider.app/Contents/Resources/app/use
|
||||
rm -r ../.temp_to_pub/EasySpider_MacOS/EasySpider.app/Contents/Resources/app/TempUserDataFolder
|
||||
rm -rf ../.temp_to_pub/EasySpider_MacOS/Code
|
||||
mkdir ../.temp_to_pub/EasySpider_MacOS/Code
|
||||
cp ../ExecuteStage/easyspider_executestage.py ../.temp_to_pub/EasySpider_MacOS/Code
|
||||
cp ../ExecuteStage/myChrome.py ../.temp_to_pub/EasySpider_MacOS/Code
|
||||
cp ../ExecuteStage/utils.py ../.temp_to_pub/EasySpider_MacOS/Code
|
||||
# cp ../ExecuteStage/easyspider_executestage.py ../.temp_to_pub/EasySpider_MacOS/Code
|
||||
# cp ../ExecuteStage/myChrome.py ../.temp_to_pub/EasySpider_MacOS/Code
|
||||
# cp ../ExecuteStage/utils.py ../.temp_to_pub/EasySpider_MacOS/Code
|
||||
cp ../ExecuteStage/*.py ../.temp_to_pub/EasySpider_MacOS/Code
|
||||
cp ../ExecuteStage/requirements.txt ../.temp_to_pub/EasySpider_MacOS/Code
|
||||
cp ../ExecuteStage/Readme.md ../.temp_to_pub/EasySpider_MacOS/Code
|
||||
cp ../ExecuteStage/myCode.py ../.temp_to_pub/EasySpider_MacOS
|
||||
|
@ -66,6 +66,7 @@ if (!fs.existsSync(path.join(getDir(), "config.json"))) {
|
||||
webserver_port: 8074,
|
||||
user_data_folder: "./user_data",
|
||||
debug: false,
|
||||
lang: "-",
|
||||
copyright: 0,
|
||||
sys_arch: require("os").arch(),
|
||||
mysql_config_path: "./mysql_config.json",
|
||||
@ -121,6 +122,12 @@ exports.start = function (port = 8074) {
|
||||
res.setHeader("Access-Control-Allow-Origin", "*"); // Set the allowed origin
|
||||
// Parse parameters
|
||||
const pathName = url.parse(req.url).pathname;
|
||||
const safeBase = path.join(__dirname, "src");
|
||||
|
||||
const safeJoin = (base, target) => {
|
||||
const targetPath = "." + path.posix.normalize("/" + target);
|
||||
return path.join(base, targetPath);
|
||||
};
|
||||
if (pathName == "/excelUpload" && req.method.toLowerCase() === "post") {
|
||||
// // parse a file upload
|
||||
// let form = new formidable.IncomingForm();
|
||||
@ -160,8 +167,16 @@ exports.start = function (port = 8074) {
|
||||
else {
|
||||
// If the path has a file extension, it is a front-end request
|
||||
// console.log(path.join(__dirname,"src/taskGrid", pathName));
|
||||
const filePath = safeJoin(safeBase, pathName);
|
||||
|
||||
if (!filePath.startsWith(safeBase)) {
|
||||
res.writeHead(400, { "Content-Type": 'text/html;charset="utf-8"' });
|
||||
res.end("Invalid path");
|
||||
return;
|
||||
}
|
||||
|
||||
fs.readFile(
|
||||
path.join(__dirname, "src", pathName),
|
||||
filePath,
|
||||
async (err, data) => {
|
||||
if (err) {
|
||||
res.writeHead(404, {
|
||||
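Review note on the `safeJoin` / `startsWith(safeBase)` guard in this hunk: a plain string-prefix test would also accept a sibling directory whose name merely begins with the same characters (for example a hypothetical `src-evil` next to `src`), and `path.posix.normalize` does not treat backslashes as separators. The sketch below is one possible way to tighten the check using `path.relative`. The names `isInside` and `resolveInside` are illustrative assumptions and are not part of the commit.

```javascript
const path = require("path");

// Hypothetical helper: true when `candidate` is `base` itself or lies below it.
function isInside(base, candidate) {
  const rel = path.relative(path.resolve(base), path.resolve(candidate));
  // A relative path that climbs upwards (or is absolute, e.g. another drive
  // on Windows) means the candidate is outside the base directory.
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return !escapes;
}

// Hypothetical helper: map a request path onto `base`; returns null if the
// result would fall outside `base`.
function resolveInside(base, target) {
  const root = path.resolve(base);
  // Treat backslashes as separators so the result is platform independent.
  const cleaned = String(target).replace(/\\/g, "/");
  // Normalising against "/" drops any leading ".." segments.
  const candidate = path.resolve(root, "." + path.posix.normalize("/" + cleaned));
  return isInside(root, candidate) ? candidate : null;
}
```

With helpers like these, the request handler could return the 400 response whenever `resolveInside` yields `null`, instead of comparing string prefixes.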
@ -200,8 +215,10 @@ exports.start = function (port = 8074) {
|
||||
let item = {
|
||||
id: task.id,
|
||||
name: task.name,
|
||||
url: task.url,
|
||||
url: task.links.split("\n")[0],
|
||||
mtime: stat.mtime,
|
||||
links: task.links,
|
||||
desc: task.desc,
|
||||
};
|
||||
if (item.id != -2) {
|
||||
output.push(item);
|
||||
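Review note on `url: task.links.split("\n")[0]` in this hunk: the expression throws if a saved task has no `links` string. Whether such task files exist is an assumption, since the task format is not shown here. A defensive variant might look like the following sketch; `firstLink` is a hypothetical name, and the fallback to `task.url` simply reuses the field the old line read.

```javascript
// Hypothetical helper: return the first non-empty line of `task.links`,
// falling back to `task.url` and finally to an empty string.
function firstLink(task) {
  const links = typeof task.links === "string" ? task.links : "";
  const first = links
    .split(/\r?\n/)            // accept both LF and CRLF line endings
    .map((line) => line.trim())
    .find((line) => line.length > 0);
  return first || task.url || "";
}
```

The task list item could then use `url: firstLink(task)` without changing the other fields.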
@ -443,6 +460,10 @@ exports.start = function (port = 8074) {
|
||||
"utf8"
|
||||
);
|
||||
config_file = JSON.parse(config_file);
|
||||
let lang = config_file["lang"];
|
||||
if(lang == undefined){
|
||||
lang = "-";
|
||||
}
|
||||
res.write(JSON.stringify(config_file));
|
||||
res.end();
|
||||
} else if (pathName == "/setUserDataFolder") {
|
||||
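Review note on the `lang` default in the last hunk above: the local variable `lang` is set to `"-"` when the key is missing, but in the lines visible here `config_file` is serialized unchanged, so the default may not reach the client. Lines outside the hunk may handle this, which cannot be confirmed from the diff. One way to apply defaults to the whole object is sketched below; `CONFIG_DEFAULTS` and `withDefaults` are hypothetical names, and only the `lang: "-"` default is taken from the diff.

```javascript
// Hypothetical defaults; only `lang: "-"` comes from the diff above.
const CONFIG_DEFAULTS = { lang: "-" };

// Return a new object in which missing (undefined) keys take their default.
function withDefaults(config, defaults = CONFIG_DEFAULTS) {
  const merged = { ...defaults };
  for (const [key, value] of Object.entries(config)) {
    if (value !== undefined) {
      merged[key] = value;
    }
  }
  return merged;
}
```

The handler could then write `JSON.stringify(withDefaults(config_file))`, so that config files created before the `lang` key existed still report `"-"`.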
|
@ -32,7 +32,7 @@
|
||||
<body>
|
||||
<div id="app">
|
||||
|
||||
<div style="padding: 10px; text-align: center;vertical-align: middle;" v-if="init">
|
||||
<div style="padding: 10px; text-align: center;vertical-align: middle;" v-if="lang=='-'">
|
||||
<h5 style="margin-top: 20px">选择语言/Select Language</h5>
|
||||
|
||||
<p><a @click="changeLang('zh')" class="btn btn-outline-primary btn-lg"
|
||||
@ -40,15 +40,15 @@
|
||||
|
||||
<p><a @click="changeLang('en')" class="btn btn-outline-primary btn-lg"
|
||||
style="margin-top: 15px; width: 300px;height:60px;padding-top:12px;">English</a></p>
|
||||
<p style="font-size: 17px">当前版本/Current Version: <b>v0.6.0</b></p>
|
||||
<p style="font-size: 17px"><a href="https://github.com/NaiboWang/EasySpider/releases" target="_blank">Github</a>最新版本/Newest Version:<b>{{newest_version}}</b></p>
|
||||
<!-- <p>如发现新版本更新,可从以下Github仓库下载最新版本使用/If a new version is found, you can download the latest version from the following Github repository:</p>-->
|
||||
<!-- <p></p>-->
|
||||
<!-- <p>如发现新版本更新,可从以下Github仓库下载最新版本使用/If a new version is found, you can download the latest version from the following Github repository:</p>-->
|
||||
<!-- <p></p>-->
|
||||
<div class="img-container">
|
||||
<!-- <h5>出品方/Producer</h5>-->
|
||||
<!-- <h5>出品方/Producer</h5>-->
|
||||
<a href="https://www.zju.edu.cn" alt="浙江大学 Zhejiang University" target="_blank"><img src="img/zju.png"></a>
|
||||
<a href="https://www.nus.edu.sg" alt="新加坡国立大学 National University of Singpaore" target="_blank"><img src="img/nuslogo.png"></a>
|
||||
<a href="https://www.xidian.edu.cn" alt="西安电子科技大学 Xidian University" target="_blank"><img src="img/xidian.png"></a>
|
||||
<a href="https://www.nus.edu.sg" alt="新加坡国立大学 National University of Singpaore" target="_blank"><img
|
||||
src="img/nuslogo.png"></a>
|
||||
<a href="https://www.xidian.edu.cn" alt="西安电子科技大学 Xidian University" target="_blank"><img
|
||||
src="img/xidian.png"></a>
|
||||
</div>
|
||||
|
||||
</div>
|
||||
@ -58,9 +58,11 @@
|
||||
<div v-if="step == -1">
|
||||
<h4 style="margin-top: 20px">Copyright and Disclaimer</h4>
|
||||
<p>Please carefully read the following instructions regarding the use of the software and commercial payments. If you agree, please accept the agreement.</p>
|
||||
<textarea class="form-control" style="margin:0 auto;width:90%; color:black; height: 450px; min-height: 200px; background: white" readonly>
|
||||
This software is intended for educational and communication purposes only. It is strictly prohibited to use the software for any illegal activities or operations, such as crawling government/military websites that are not allowed to be crawled. The user bears all consequences resulting from the use of this software and the author shall not be held responsible or liable in any way. Furthermore, the software is protected by patent rights. If you intend to use it for commercial purposes or profit-making activities, such as using the software for client orders, selling the collected data, please contact author: naibowang@foxmail.com for patent authorization and payment operations: https://www.patentguru.com/cn/search?q=一种自定义提取流程的服务封装系统
|
||||
For individual users, EasySpider is a completely free and ad-free open-source software. The development and maintenance of the software rely solely on the author's voluntary efforts. Therefore, you can choose to support the author, allowing them to have more enthusiasm and energy to maintain this software. Alternatively, if you have profited from using this software, you are welcome to support the author through the following methods:
|
||||
<textarea class="form-control"
|
||||
style="margin:0 auto;width:90%; color:black; height: 450px; min-height: 200px; background: white"
|
||||
readonly>
|
||||
This software is intended for educational and communication purposes only. It is strictly prohibited to use the software for any illegal activities or operations, such as crawling government/military websites that are not allowed to be crawled. The user bears all consequences resulting from the use of this software and the author shall not be held responsible or liable in any way.
|
||||
EasySpider is a completely free and ad-free open-source software. The development and maintenance of the software rely solely on the author's voluntary efforts. Therefore, you can choose to support the author, allowing them to have more enthusiasm and energy to maintain this software. Alternatively, if you have profited from using this software, you are welcome to support the author through the following methods:
|
||||
|
||||
1. PayPal account: naibowang, or scan the QR code provided in the software package.
|
||||
2. Alipay account: naibowang@foxmail.com, or scan the QR code provided in the software package.
|
||||
@ -68,7 +70,8 @@ For individual users, EasySpider is a completely free and ad-free open-source so
|
||||
|
||||
</textarea>
|
||||
<p><a @click="acceptAgreement" class="btn btn-primary btn-lg"
|
||||
style="margin-top: 30px; width: 300px;height:60px;padding-top:12px;color:white">Agree and Start</a></p>
|
||||
style="margin-top: 30px; width: 300px;height:60px;padding-top:12px;color:white">Agree and Start</a>
|
||||
</p>
|
||||
</div>
|
||||
<div v-if="step == 0">
|
||||
<p style="margin-top: 20px">Hint: Click Button below to start.</p>
|
||||
@ -83,13 +86,20 @@ For individual users, EasySpider is a completely free and ad-free open-source so
|
||||
style="margin-top: 15px; width: 300px;height:60px;padding-top:12px;color:white">View/Manage/Execute
|
||||
Tasks</a></p>
|
||||
<p>
|
||||
<a href="https://www.easyspider.cn/index_english.html" target="_blank" style="text-align: center; font-size: 18px">Browse official website to watch tutorials</a>
|
||||
<a href="https://www.easyspider.cn/index_english.html" target="_blank"
|
||||
style="text-align: center; font-size: 18px">Browse official website to watch tutorials</a>
|
||||
</p>
|
||||
<p style="font-size: 17px">Current Version: <b>v0.6.3</b></p>
|
||||
<p style="font-size: 17px"><a href="https://github.com/NaiboWang/EasySpider/releases"
|
||||
target="_blank">Newest</a> Version: <b>{{newest_version}}</b></p>
|
||||
<div class="img-container">
|
||||
<!-- <h5>Producer</h5>-->
|
||||
<a href="https://www.zju.edu.cn" alt="Zhejiang University" target="_blank"><img src="img/zju.png"></a>
|
||||
<a href="https://www.nus.edu.sg" alt="National University of Singapore" target="_blank"><img src="img/nuslogo.png"></a>
|
||||
<a href="https://www.xidian.edu.cn" alt="Xidian University" target="_blank"><img src="img/xidian.png"></a>
|
||||
<!-- <h5>Producer</h5>-->
|
||||
<a href="https://www.zju.edu.cn" alt="Zhejiang University" target="_blank"><img
|
||||
src="img/zju.png"></a>
|
||||
<a href="https://www.nus.edu.sg" alt="National University of Singapore" target="_blank"><img
|
||||
src="img/nuslogo.png"></a>
|
||||
<a href="https://www.xidian.edu.cn" alt="Xidian University" target="_blank"><img
|
||||
src="img/xidian.png"></a>
|
||||
</div>
|
||||
</div>
|
||||
<div v-else-if="step == 1">
|
||||
@ -113,7 +123,8 @@ For individual users, EasySpider is a completely free and ad-free open-source so
|
||||
style="margin-top: 15px; width: 300px;height:60px;padding-top:12px;color:white">Start Data Mode</a>
|
||||
</p>
|
||||
|
||||
<a @click="step = 0" class="btn btn-outline-primary btn-lg"style="margin-top: 10px; width: 302px;height:45px;padding-top:5px">Go to Home Page</a>
|
||||
<a @click="step = 0" class="btn btn-outline-primary btn-lg"
|
||||
style="margin-top: 10px; width: 302px;height:45px;padding-top:5px">Go to Home Page</a>
|
||||
|
||||
</div>
|
||||
<div v-else-if="step == 2">
|
||||
@ -121,7 +132,11 @@ For individual users, EasySpider is a completely free and ad-free open-source so
|
||||
<div style="margin: 0 auto; width:90%">
|
||||
<p style="margin-top: 20px; text-align: justify">
|
||||
Please specify the directory of user data below. Once set, the browser will load cookies and other contents such as user login information from this directory. The browser will load data from this directory every time it is designed and executed, as long as the directory remains the same. </p>
|
||||
<p style="text-align: justify">For example, if the <b>./user_data</b> folder is set and you log in at <b>ebay.com</b> during the design process, then the previous login status will still be retained when you specify the <b>./user_data</b> folder again for the next design or task execution when you open <b>ebay.com</b>.</p>
|
||||
<p style="text-align: justify">For example, if the
|
||||
<b>./user_data</b> folder is set and you log in at
|
||||
<b>ebay.com</b> during the design process, then the previous login status will still be retained when you specify the
|
||||
<b>./user_data</b> folder again for the next design or task execution when you open
|
||||
<b>ebay.com</b>.</p>
|
||||
<p style="text-align: justify">If there are multiple configurations, different directories can be set for each configuration. Each directory will be treated as a separate configuration set, and if a directory does not exist, it will be created automatically.</p>
|
||||
<p><textarea class="form-control" style="min-height: 50px;"
|
||||
v-model="user_data_folder"></textarea>
|
||||
@ -129,13 +144,16 @@ For individual users, EasySpider is a completely free and ad-free open-source so
|
||||
</div>
|
||||
<p><a @click="startDesign('en', true)"
|
||||
class="btn btn-primary btn-lg"
|
||||
style="margin-top: 15px; width: 300px;height:60px;padding-top:12px;color:white">Start Design</a></p>
|
||||
style="margin-top: 15px; width: 300px;height:60px;padding-top:12px;color:white">Start Design</a>
|
||||
</p>
|
||||
<p>
|
||||
<p><a @click="startDesign('en', true, true)"
|
||||
class="btn btn-primary btn-lg"
|
||||
style="margin-top: 15px; width: 300px;height:60px;padding-top:12px;color:white">Start Design (Mobile)</a></p>
|
||||
style="margin-top: 15px; width: 300px;height:60px;padding-top:12px;color:white">Start Design (Mobile)</a>
|
||||
</p>
|
||||
<p>
|
||||
<a @click="step = 0" class="btn btn-outline-primary btn-lg"style="margin-top: 10px; width: 302px;height:45px;padding-top:5px">Go to Home Page</a>
|
||||
<a @click="step = 0" class="btn btn-outline-primary btn-lg"
|
||||
style="margin-top: 10px; width: 302px;height:45px;padding-top:5px">Go to Home Page</a>
|
||||
</p>
|
||||
</div>
|
||||
</div>
|
||||
@ -143,36 +161,46 @@ For individual users, EasySpider is a completely free and ad-free open-source so
|
||||
<div v-if="step == -1">
|
||||
<h4 style="margin-top: 20px">版权声明和注意事项</h4>
|
||||
<p>请接受下方使用协议以使用软件,不同意请退出。</p>
|
||||
<textarea class="form-control" style="margin:0 auto;width:90%; color:black; height: 480px; min-height: 200px; background: white" readonly>
|
||||
本软件仅供学习交流使用,严禁使用软件进行任何违法违规的操作,如爬取不允许爬取的政府/军事机关网站等。使用本软件所造成的一切后果由使用者自负,与作者本人无关,作者不会承担任何责任。同时,软件受到专利权保护,如要用于商业用途,如使用软件进行盈利接单,用于公司业务,或出售采集到的数据等,请邮件联系作者:naibowang@foxmail.com进行专利授权等付费操作:https://www.patentguru.com/cn/search?q=一种自定义提取流程的服务封装系统
|
||||
<textarea class="form-control"
|
||||
style="margin:0 auto;width:90%; color:black; height: 480px; min-height: 200px; background: white"
|
||||
readonly>
|
||||
本软件仅供学习交流使用,严禁使用软件进行任何违法违规的操作,如爬取不允许爬取的政府/军事机关网站等。使用本软件所造成的一切后果由使用者自负,与作者本人无关,作者不会承担任何责任。
|
||||
|
||||
对于个人使用者来说,易采集EasySpider是一款完全免费无广告的开源软件,软件开发和维护全靠作者用爱发电,因此您可以选择支持作者让作者有更多的热情和精力维护此软件,或者您使用了此软件进行了盈利,欢迎您通过下面的方式支持作者:
|
||||
易采集EasySpider是一款完全免费无广告的开源软件,软件开发和维护全靠作者用爱发电,因此您可以选择支持作者让作者有更多的热情和精力维护此软件,或者您使用了此软件进行了盈利,欢迎您通过下面的方式支持作者:
|
||||
|
||||
1、支付宝账号:naibowang@foxmail.com,也可以扫描软件包中带的二维码。
|
||||
2、微信收款:扫描软件包中带的二维码。
|
||||
3、PayPal账号:naibowang,或扫描软件包中带的二维码。
|
||||
</textarea>
|
||||
<p><a @click="acceptAgreement" class="btn btn-primary btn-lg"
|
||||
style="margin-top: 30px; width: 300px;height:60px;padding-top:12px;color:white">同意并开始使用</a></p>
|
||||
style="margin-top: 30px; width: 300px;height:60px;padding-top:12px;color:white">同意并开始使用</a>
|
||||
</p>
|
||||
</div>
|
||||
<div v-if="step == 0">
|
||||
<p style="margin-top: 20px">提示:点击下方按钮开始使用。</p>
|
||||
|
||||
<p><a @click="step = 1" class="btn btn-primary btn-lg"
|
||||
style="margin-top: 15px; width: 300px;height:60px;padding-top:12px;color:white">设计/修改任务</a></p>
|
||||
style="margin-top: 15px; width: 300px;height:60px;padding-top:12px;color:white">设计/修改任务</a>
|
||||
</p>
|
||||
|
||||
<p><a @click="startInvoke('zh')"
|
||||
@click class="btn btn-primary btn-lg"
|
||||
style="margin-top: 15px; width: 300px;height:60px;padding-top:12px;color:white">查看/管理/执行任务</a>
|
||||
</p>
|
||||
<p>
|
||||
<a href="https://www.easyspider.cn?lang=zh" target="_blank" style="text-align: center; font-size: 18px">点此访问官网查看文档/视频教程</a>
|
||||
<a href="https://www.easyspider.cn?lang=zh" target="_blank"
|
||||
style="text-align: center; font-size: 18px">点此访问官网查看文档/视频教程</a>
|
||||
</p>
|
||||
<p style="font-size: 17px">软件当前版本:<b>v0.6.3</b></p>
|
||||
<p style="font-size: 17px"><a href="https://github.com/NaiboWang/EasySpider/releases"
|
||||
target="_blank">官网</a>最新版本:<b>{{newest_version}}</b></p>
|
||||
<div class="img-container">
|
||||
<!-- <h5>出品方</h5>-->
|
||||
<!-- <h5>出品方</h5>-->
|
||||
<a href="https://www.zju.edu.cn" alt="浙江大学" target="_blank"><img src="img/zju.png"></a>
|
||||
<a href="https://www.nus.edu.sg" alt= "新加坡国立大学" target="_blank"><img src="img/nuslogo.png"></a>
|
||||
<a href="https://www.xidian.edu.cn" alt="西安电子科技大学" target="_blank"><img src="img/xidian.png"></a>
|
||||
<a href="https://www.nus.edu.sg" alt="新加坡国立大学" target="_blank"><img
|
||||
src="img/nuslogo.png"></a>
|
||||
<a href="https://www.xidian.edu.cn" alt="西安电子科技大学" target="_blank"><img
|
||||
src="img/xidian.png"></a>
|
||||
</div>
|
||||
</div>
|
||||
<div v-else-if="step == 1">
|
||||
@ -194,7 +222,8 @@ For individual users, EasySpider is a completely free and ad-free open-source so
|
||||
style="margin-top: 15px; width: 320px;height:60px;padding-top:12px;color:white">使用带用户信息浏览器设计</a>
|
||||
</p>
|
||||
<p>
|
||||
<a @click="step = 0" class="btn btn-outline-primary btn-lg"style="margin-top: 10px; width: 322px;height:45px;padding-top:5px">返回首页</a>
|
||||
<a @click="step = 0" class="btn btn-outline-primary btn-lg"
|
||||
style="margin-top: 10px; width: 322px;height:45px;padding-top:5px">返回首页</a>
|
||||
</p>
|
||||
|
||||
|
||||
@ -216,9 +245,11 @@ For individual users, EasySpider is a completely free and ad-free open-source so
|
||||
<p>
|
||||
<p><a @click="startDesign('zh', true, true)"
|
||||
class="btn btn-primary btn-lg"
|
||||
style="margin-top: 15px; width: 320px;height:60px;padding-top:12px;color:white">开始设计(手机模式)</a></p>
|
||||
style="margin-top: 15px; width: 320px;height:60px;padding-top:12px;color:white">开始设计(手机模式)</a>
|
||||
</p>
|
||||
<p>
|
||||
<a @click="step = 0" class="btn btn-outline-primary btn-lg"style="margin-top: 10px; width: 322px;height:45px;padding-top:5px">返回首页</a>
|
||||
<a @click="step = 0" class="btn btn-outline-primary btn-lg"
|
||||
style="margin-top: 10px; width: 322px;height:45px;padding-top:5px">返回首页</a>
|
||||
</p>
|
||||
</div>
|
||||
|
||||
|
@ -22,7 +22,7 @@ let app = Vue.createApp({
|
||||
data() {
|
||||
return {
|
||||
init: true,
|
||||
lang: 'zh',
|
||||
lang: '-',
|
||||
user_data_folder: getUrlParam("user_data_folder"),
|
||||
copyright: 0,
|
||||
step: 0,
|
||||
@ -34,6 +34,10 @@ let app = Vue.createApp({
|
||||
if(this.copyright == 0){
|
||||
this.step = -1;
|
||||
}
|
||||
this.lang = getUrlParam("lang");
|
||||
if (this.lang == 'undefined' || this.lang == '') {
|
||||
this.lang = '-';
|
||||
}
|
||||
// Send a GET request to fetch the GitHub Release API response
|
||||
const request = new XMLHttpRequest();
|
||||
request.open('GET', `https://api.github.com/repos/NaiboWang/EasySpider/releases/latest`);
|
||||
@ -52,8 +56,9 @@ let app = Vue.createApp({
|
||||
},
|
||||
methods: {
|
||||
changeLang(lang = 'zh') {
|
||||
this.init = false;
|
||||
// this.init = false;
|
||||
this.lang = lang;
|
||||
window.electronAPI.changeLang(lang);
|
||||
},
|
||||
acceptAgreement() {
|
||||
this.step = 0;
|
||||
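The `lang` handling in the hunk above falls back to `'-'` (no language chosen yet) when the URL carries no usable value. A minimal sketch of that fallback follows, assuming a `getUrlParam` helper built on `URLSearchParams`; the real helper is not shown in this diff, so both `getUrlParam` as written here and `resolveLang` are hypothetical names for illustration. The `null` check is an added assumption that covers a helper returning `null` for a missing parameter.

```javascript
// Hypothetical stand-in for the page's getUrlParam helper (not shown in the diff).
function getUrlParam(search, name) {
  // Returns null when the parameter is absent.
  return new URLSearchParams(search).get(name);
}

// Mirrors the check in the diff: a missing, empty, or literal "undefined"
// value means no language has been selected, which is represented by "-".
function resolveLang(search) {
  const lang = getUrlParam(search, "lang");
  if (lang == null || lang === "" || lang === "undefined") {
    return "-";
  }
  return lang;
}

console.log(resolveLang("?lang=en"));
console.log(resolveLang("?user_data_folder=abc"));
```

Keeping the sentinel in one place makes it easier to show the language picker only while `lang` is still `'-'`.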
|
@ -11,4 +11,5 @@ contextBridge.exposeInMainWorld('electronAPI', {
|
||||
startDesign: (lang="en", user_data_folder = '', mobile=false) => ipcRenderer.send('start-design', lang, user_data_folder, mobile),
|
||||
startInvoke: (lang="en") => ipcRenderer.send('start-invoke', lang),
|
||||
acceptAgreement: () => ipcRenderer.send('accept-agreement'),
|
||||
changeLang: (lang="en") => ipcRenderer.send('change-lang', lang)
|
||||
})
|
@ -89,7 +89,7 @@
|
||||
</div>
|
||||
|
||||
<div>
|
||||
<label>Tip: Hover over the smiley face to view hints, <b>double-click</b> on an action in the flowchart to test run, <b>right-click</b> on an action to see more options.</label>
|
||||
<label>Tip: Hover over the smiley face to view hints, <b>double-click</b> on an action in the flowchart to <b>trial run</b> (can only run when the webpage is fully loaded), <b>right-click</b> on an action to see more options.</label>
|
||||
<label>Option Name:</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model='list.nl[index.nowNodeIndex]["title"]'></input>
|
||||
</div>
|
||||
@ -145,39 +145,12 @@
|
||||
</div>
|
||||
<div>
|
||||
<label>XPath (or use "point(10,10)" to click the web page at coordinates (10, 10), which is useful when you need to click a blank area to dismiss a popup dialog): <span style="font-size: 30px!important;" title="Relative XPATH writing: start with /, e.g. the loop item XPATH is /html/body/div[1], your input is /*[@id='tab-customer'], then the final addressed xpath is: /html/body/div[1]/*[@id='tab-customer']">☺</span></label>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='nowNode["parameters"]["xpath"]'></textarea>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='xpath'></textarea>
|
||||
<p><button type="button" data-toggle="modal" data-target="#myModal_XPath" @click="changeXPaths(nowNode['parameters']['allXPaths'])" class="btn btn-primary" style="margin-top: 10px">Click here to view other equivalent XPath expressions</button></p>
|
||||
<label>The final XPath of this element when the task is running:</label>
|
||||
<textarea v-model="getFinalXPath(nowNode['parameters']['xpath'], useLoop)" spellcheck="false" onkeydown="inputDelete(event)" class="form-control" rows="2" readonly style="background:ghostwhite"></textarea>
|
||||
</div>
|
||||
<label>Maximum wait time for page load after clicking (in seconds):</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['maxWaitTime']" type="number" required></input>
|
||||
<label>Click Type:</label>
|
||||
<select v-model='nowNode["parameters"]["clickWay"]' class="form-control">
|
||||
<option :value = 0>Selenium</option>
|
||||
<option :value = 1>JavaScript</option>
|
||||
</select>
|
||||
<label>Open link in new tab:</label>
|
||||
<select v-model='nowNode["parameters"]["newTab"]' class="form-control">
|
||||
<option :value = 1>Yes</option>
|
||||
<option :value = 0>No</option>
|
||||
</select>
|
||||
<label>Whether to scroll down after clicking:</label>
|
||||
<select v-model='nowNode["parameters"]["scrollType"]' class="form-control">
|
||||
<option :value = 0>No Scrolling</option>
|
||||
<option :value = 1>Scroll one screen</option>
|
||||
<option :value = 2>Scroll to the end</option>
|
||||
<option :value = 3>Keep scrolling until the page data does not change</option>
|
||||
</select>
|
||||
<label>Scroll Times:</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input>
|
||||
<label>Wait time after scrolling (in seconds):</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollWaitTime']" type="number" required></input>
|
||||
<label>Way to handle pop-up windows after clicking:</label>
|
||||
<p><select v-model='nowNode["parameters"]["alertHandleType"]' class="form-control">
|
||||
<option :value = 0>No pop-up window</option>
|
||||
<option :value = 1>Accept pop-up window</option>
|
||||
<option :value = 2>Reject pop-up window (only for Confirm pop-up window)</option>
|
||||
</select></p>
|
||||
<p style="margin-top: 10px">
|
||||
<p style="margin-top: 10px">
|
||||
<a class="btn btn-primary" data-toggle="collapse" href="#collapseExample" role="button" aria-expanded="false" aria-controls="collapseExample">
|
||||
Click here to expand/collapse advanced operations
|
||||
</a>
|
||||
@ -195,6 +168,38 @@
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" required class="form-control" type="number" v-model.number='nowNode["parameters"]["afterJSWaitTime"]'></input>
|
||||
</div>
|
||||
</div>
|
||||
<label>Maximum wait time for page load after clicking (in seconds):</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['maxWaitTime']" type="number" required></input>
|
||||
<label>Click Type (including double-click):</label>
|
||||
<select v-model='nowNode["parameters"]["clickWay"]' class="form-control">
|
||||
<option :value = 0>Selenium</option>
|
||||
<option :value = 1>JavaScript</option>
|
||||
<option :value = 2>Double-click</option>
|
||||
</select>
|
||||
<label>Open link in new tab:</label>
|
||||
<select v-model='nowNode["parameters"]["newTab"]' class="form-control">
|
||||
<option :value = 1>Yes</option>
|
||||
<option :value = 0>No</option>
|
||||
</select>
|
||||
<label>Whether to scroll down after clicking:</label>
|
||||
<select v-model='nowNode["parameters"]["scrollType"]' class="form-control">
|
||||
<option :value = 0>No Scrolling</option>
|
||||
<option :value = 1>Scroll one screen</option>
|
||||
<option :value = 2>Scroll to the end</option>
|
||||
<option :value = 3>Keep scrolling until the page data does not change</option>
|
||||
</select>
|
||||
<label>Scroll Times:</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input>
|
||||
<label>Wait time after scrolling (in seconds):</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollWaitTime']" type="number" required></input>
|
||||
<label>Maximum file download wait time (in seconds):</label>
|
||||
<input spellcheck="false" onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['downloadWaitTime']" type="number" required></input>
|
||||
<label>Way to handle pop-up windows after clicking:</label>
|
||||
<p><select v-model='nowNode["parameters"]["alertHandleType"]' class="form-control">
|
||||
<option :value = 0>No pop-up window</option>
|
||||
<option :value = 1>Accept pop-up window</option>
|
||||
<option :value = 2>Reject pop-up window (only for Confirm pop-up window)</option>
|
||||
</select></p>
|
||||
|
||||
|
||||
|
||||
@ -237,6 +242,9 @@
|
||||
<p>XPATH (Field["FieldName"] and eval("your code") can be used in any XPATHS): <span style="font-size: 30px!important;" title="Relative XPATH writing: start with /, e.g. the loop item XPATH is /html/body/div[1], your input is /*[@id='tab-customer'], then the final addressed xpath is: /html/body/div[1]/*[@id='tab-customer']">☺</span></p>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='params.parameters[paraIndex]["relativeXPath"]' placeholder="If you want to write the XPath relative to the current element in the loop, you can write as *../div[1] which matches the first div child element of the parent of the current element in the loop."></textarea>
|
||||
<p><button type="button" data-toggle="modal" data-target="#myModal_XPath" @click="changeXPaths(params.parameters[paraIndex]['allXPaths'])" class="btn btn-primary" style="margin-top: 10px">Click here to view other equivalent XPath expressions</button></p>
|
||||
<label>Final XPath of this field when the task is running:</label>
|
||||
<textarea spellcheck="false" onkeydown="inputDelete(event)" class="form-control" rows="2" readonly style="background:ghostwhite">{{getFinalXPath(params.parameters[paraIndex]['relativeXPath'], params.parameters[paraIndex]['relative'])}}</textarea>
|
||||
<div style="margin-top: 10px"><a href="#" v-on:mousedown="trailParam(paraIndex)" style="text-decoration: none">Trial Run (only tests the first matched element)</a></div>
|
||||
<p style="margin-top: 10px">
|
||||
<a class="btn btn-primary" data-toggle="collapse" href="#elementAdvanced" role="button" aria-expanded="false" aria-controls="collapseExample">
|
||||
Click here to expand/collapse advanced operations
|
||||
@ -244,7 +252,6 @@
|
||||
</p>
|
||||
<div :class="{collapse: true, 'show': params.parameters[paraIndex]['beforeJS'].length!=0 || params.parameters[paraIndex]['afterJS'].length!=0}" id="elementAdvanced">
|
||||
<div>
|
||||
<div><a href="#" v-on:mousedown="trailParam(paraIndex)" style="text-decoration: none">Trial Run</a></div>
|
||||
<label>Execute a JavaScript script <strong>before</strong> extracting data from this element: </label>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2"
|
||||
placeholder='The element should be represented by arguments[0]. Here is an example JavaScript code: arguments[0].innerText = arguments[0].innerText.replace("United States","US"). This code replaces occurrences of "United States" with "US" in the text of the element. Subsequently, when extracting data, you will obtain the replaced value.' v-model='params.parameters[paraIndex]["beforeJS"]'></textarea>
|
||||
@ -256,21 +263,6 @@
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" required class="form-control" type="number" v-model.number='params.parameters[paraIndex]["afterJSWaitTime"]'></input>
|
||||
</div>
|
||||
</div>
|
||||
<label>Parameter type conversion (for Excel and Database):</label>
|
||||
<select v-model='params.parameters[paraIndex]["paraType"]' class="form-control">
|
||||
<option value = "text">Text (for single values estimated to exceed 10,000 in length, please choose Large Text)</option>
|
||||
<option value = "int">Integer (up to 9 digits)</option>
|
||||
<option value = "double">Floating Number (Decimal)</option>
|
||||
<option value = "mediumText">Large Text (single value length exceeding 10,000 but less than 1,000,000)</option>
|
||||
<option value = "datetime">Date Time</option>
|
||||
<option value = "date">Date</option>
|
||||
<option value = "time">Time</option>
|
||||
<option value = "varchar">Small Text (single value length less than 50)</option>
|
||||
<option value = "longText">Extra Large Text (single value length exceeding 1,000,000)</option>
|
||||
<option value = "bigInt">Large Integer (more than 9 digits)</option>
|
||||
</select>
|
||||
<label>Default value when this element cannot be found:</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model='params.parameters[paraIndex]["default"]'></input>
|
||||
<label>Extract Type</label>
|
||||
<select v-model='params.parameters[paraIndex]["contentType"]' class="form-control">
|
||||
<option :value = 0>Text (include child element)</option>
|
||||
@ -280,6 +272,7 @@
|
||||
<option :value = 4>Background Image Address</option>
|
||||
<option :value = 5>Webpage URL</option>
|
||||
<option :value = 6>Webpage Title</option>
|
||||
<option :value = 15>Constant String</option>
|
||||
<option :value = 7>Element Screenshot</option>
|
||||
<option :value = 8>OCR Results</option>
|
||||
<option :value = 14>Properties of elements</option>
|
||||
@ -289,7 +282,11 @@
|
||||
<option :value = 10>Selected value of the current select box</option>
|
||||
<option :value = 11>Selected text of the current select box</option>
|
||||
</select>
|
||||
<div v-if='params.parameters[paraIndex]["contentType"] == 14'>
|
||||
<div v-if='params.parameters[paraIndex]["contentType"] == 15'>
|
||||
<label>Constant String:</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model='params.parameters[paraIndex]["JS"]' placeholder="This field type is usually used for remarks"></input>
|
||||
</div>
|
||||
<div v-else-if='params.parameters[paraIndex]["contentType"] == 14'>
|
||||
<label>Attribute Name:</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model='params.parameters[paraIndex]["JS"]' placeholder="Attribute names, such as href to represent the href attribute of the current element, that is, the link address."></input>
|
||||
</div>
|
||||
@ -320,6 +317,21 @@
|
||||
<!-- <option :value = 0>普通提取</option>-->
|
||||
<!-- <option :value = 1>OCR提取</option>-->
|
||||
<!-- </select>-->
|
||||
<label>Parameter type conversion (for Excel and Database):</label>
|
||||
<select v-model='params.parameters[paraIndex]["paraType"]' class="form-control">
|
||||
<option value = "text">Text (for single values estimated to exceed 10,000 in length, please choose Large Text)</option>
|
||||
<option value = "int">Integer (up to 9 digits)</option>
|
||||
<option value = "double">Floating Number (Decimal)</option>
|
||||
<option value = "mediumText">Large Text (single value length exceeding 10,000 but less than 1,000,000)</option>
|
||||
<option value = "datetime">Date Time</option>
|
||||
<option value = "date">Date</option>
|
||||
<option value = "time">Time</option>
|
||||
<option value = "varchar">Small Text (single value length less than 50)</option>
|
||||
<option value = "longText">Extra Large Text (single value length exceeding 1,000,000)</option>
|
||||
<option value = "bigInt">Large Integer (more than 9 digits)</option>
|
||||
</select>
|
||||
<label>Default value when this element cannot be found:</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model='params.parameters[paraIndex]["default"]'></input>
|
||||
<label style="margin-top: 15px">Wrap content to new line (set when collecting long articles and wanting to wrap):</label>
|
||||
<select v-model='params.parameters[paraIndex]["splitLine"]' class="form-control">
|
||||
<option :value="0">No</option>
|
||||
@ -388,8 +400,11 @@
|
||||
<option :value = 5>Run Python code on current environment (the "exec" operation)</option>
|
||||
<option :value = 6>Get value of a Python expression (the "eval" operation)</option>
|
||||
<option :value = 7>Pause program execution (such as when the captcha box appears)</option>
|
||||
<option :value = 12>Exit Program</option>
|
||||
<option :value = 8>Refresh page</option>
|
||||
<option :value = 9>Send Email</option>
|
||||
<option :value = 10>Clear all field values</option>
|
||||
<option :value = 11>Generate new data row</option>
|
||||
</select>
|
||||
<div v-if='nowNode["parameters"]["codeMode"] < 3 || nowNode["parameters"]["codeMode"] >= 5 && nowNode["parameters"]["codeMode"] <=6'>
|
||||
<label>Code (Use Field["FieldName"] to input the latest value of a field): </label>
|
||||
@ -480,7 +495,12 @@ Please note that this feature does not support assigning values to variables. In
|
||||
<label>Email content:</label>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='nowNode["parameters"]["emailConfig"]["content"]' placeholder="Write the email content here"></textarea>
|
||||
</div>
|
||||
|
||||
<div v-if='nowNode["parameters"]["codeMode"] == 10'>
|
||||
<label>This action clears all field values, for example before a web scraping task starts.</label>
|
||||
</div>
|
||||
<div v-if='nowNode["parameters"]["codeMode"] == 11'>
|
||||
<label>This action generates a new row of data. For example, you can design a scraping task so that it does not generate rows as it goes, and instead generates one new row once all fields have been extracted.</label>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="elements" v-if="nodeType==6">
|
||||
@ -519,6 +539,8 @@ Please note that this feature does not support assigning values to variables. In
|
||||
<label>XPath: </label>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='nowNode["parameters"]["xpath"]'></textarea>
|
||||
<p><button type="button" data-toggle="modal" data-target="#myModal_XPath" @click="changeXPaths(nowNode['parameters']['allXPaths'])" class="btn btn-primary" style="margin-top: 10px">Click here to view other equivalent XPath expressions</button></p>
|
||||
<label>The final XPath of this element when the task is running:</label>
|
||||
<textarea v-model="getFinalXPath(nowNode['parameters']['xpath'], useLoop)" spellcheck="false" onkeydown="inputDelete(event)" class="form-control" rows="2" readonly style="background:ghostwhite"></textarea>
|
||||
</div>
|
||||
|
||||
|
||||
@ -558,7 +580,7 @@ Please note that this feature does not support assigning values to variables. In
|
||||
Loop based on the expression value of Python code. Here are some examples:
|
||||
1. Return relevant values of the current browser object. Use `self.browser` to refer to the current browser being operated. You can directly use Selenium's API to perform operations, such as `self.browser.find_element(By.CSS_SELECTOR, "body").text=="123"`, which checks whether the text of the page body is exactly "123".
|
||||
2. Return the value of a custom global variable: `self.myVar`
|
||||
3. Return the result of a conditional statement: `self.myVar == 1`
|
||||
3. Return the result of a conditional statement: `self.myVar > 1`
|
||||
4. Determining whether the value extracted from a certain field is equal to the value of a certain variable: self.outputParameters["field name"] == self.myVar
|
||||
If the expression returns a value greater than 0 or evaluates to True, the loop continues; otherwise, it stops.
|
||||
</pre>
|
||||
@ -700,8 +722,8 @@ If the expression returns a value greater than 0 or evaluates to True, the opera
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" id="serviceDescription" name="serviceDescription" class="form-control"></input>
|
||||
<label>Export Data Format (Excel/CSV/TXT/Database):</label>
|
||||
<select id="outputFormat" class="form-control">
|
||||
<option value="xlsx">XLSX (Excel file; CSV is recommended when a single cell exceeds 500 characters)</option>
|
||||
<option value="csv">CSV (Recommended for collecting long articles)</option>
|
||||
<option value="xlsx">XLSX (Excel file; CSV is recommended when a single cell exceeds 500 characters)</option>
|
||||
<option value="txt">TXT</option>
|
||||
<option value="json">JSON</option>
|
||||
<option value="mysql">MySQL Database (recommended for large amounts of data)</option>
|
||||
@ -712,6 +734,7 @@ If the expression returns a value greater than 0 or evaluates to True, the opera
|
||||
<select id="dataWriteMode" name="dataWriteMode" class="form-control">
|
||||
<option value="1">Append (If the file exists, append to it)</option>
|
||||
<option value="2">Overwrite (If the file exists, overwrite it)</option>
|
||||
<option value=3>Rename on Write (renames file if it already exists)</option>
|
||||
</select>
|
||||
<!-- <label>Is it an extreme anti-scraping website like Cloudflare (<a href="https://www.bilibili.com/video/BV1Ph4y1E7R9/" target="_blank">Watch Tutorial</a>)?</label>-->
|
||||
<!-- <select id="cloudflare" name="cloudflare" class="form-control">-->
|
||||
|
@ -46,6 +46,7 @@ let app = new Vue({
|
||||
index: vueData,
|
||||
nodeType: 0, // type of the current element
|
||||
nowNode: null, // temporarily holds the current element's node
|
||||
parentNode: null, // temporarily holds the current element's parent node
|
||||
codeMode: -1, // code mode
|
||||
loopType: -1, // loop type option, used when a loop is clicked
|
||||
useLoop: false, // whether to use the element inside the loop
|
||||
@ -53,6 +54,7 @@ let app = new Vue({
|
||||
params: {"parameters": []}, // parameter list for data extraction
|
||||
TClass: -1, // condition type of the conditional branch
|
||||
paraIndex: 0, // index of the current parameter
|
||||
xpath: "", // XPath of the current action
|
||||
XPaths: "", // list of XPaths
|
||||
},
|
||||
mounted: function () {
|
||||
@ -62,6 +64,12 @@ let app = new Vue({
|
||||
// console.log("scroll")
|
||||
// }, 500);
|
||||
},
|
||||
// computed: {
|
||||
// finalXPath: function () {
|
||||
// console.log("Call finalXPath")
|
||||
// return this.getFinalXPath(this.nowNode["parameters"]["xpath"], this.nowNode["parameters"]["useLoop"]);
|
||||
// }
|
||||
// },
|
||||
watch: {
|
||||
nowArrow: { // perform some operations when this variable changes
|
||||
deep: true,
|
||||
@ -91,6 +99,11 @@ let app = new Vue({
|
||||
updateUI();
|
||||
}
|
||||
},
|
||||
'nowNode.parameters.xpath': { // update the parameter value when the xpath changes
|
||||
handler: function (newVal, oldVal) {
|
||||
console.log("xpath changed", newVal, oldVal);
|
||||
}
|
||||
},
|
||||
loopType: { // update the parameter value when the loop type changes
|
||||
handler: function (newVal, oldVal) {
|
||||
// this.nowNode["parameters"]["loopType"] = newVal;
|
||||
@ -106,6 +119,11 @@ let app = new Vue({
|
||||
this.nowNode["parameters"]["useLoop"] = newVal;
|
||||
}
|
||||
},
|
||||
xpath: {
|
||||
handler: function (newVal, oldVal) {
|
||||
this.nowNode["parameters"]["xpath"] = newVal;
|
||||
}
|
||||
},
|
||||
params: {
|
||||
handler: function (newVal, oldVal) {
|
||||
this.nowNode["parameters"]["params"] = newVal["parameters"];
|
||||
@ -123,6 +141,26 @@ let app = new Vue({
|
||||
}
|
||||
},
|
||||
methods: {
|
||||
getFinalXPath: function (xpath, useLoop) { // get the final XPath
|
||||
// console.log(xpath, useLoop, this.parentNode);
|
||||
if (this.parentNode == null || this.parentNode.parameters == null || this.parentNode.parameters.xpath == null) {
|
||||
return xpath;
|
||||
} else if (useLoop) {
|
||||
let parent_xpath = this.parentNode.parameters.xpath;
|
||||
let final_xpath = "";
|
||||
final_xpath = parent_xpath + xpath;
|
||||
if (this.parentNode.parameters.loopType == 2) {
|
||||
parent_xpath = this.parentNode.parameters.pathList.split("\n");
|
||||
final_xpath = "";
|
||||
for (let i = 0; i < parent_xpath.length; i++) {
|
||||
final_xpath += parent_xpath[i] + xpath + "\n";
|
||||
}
|
||||
}
|
||||
return final_xpath;
|
||||
} else {
|
||||
return xpath;
|
||||
}
|
||||
},
|
||||
handleCodeModeChange: function () {
|
||||
// if (this.codeMode == undefined || this.codeMode == null || this.codeMode == -1) {
|
||||
// return;
|
||||
@ -137,7 +175,7 @@ let app = new Vue({
|
||||
this.nowNode["title"] = LANG("运行操作系统命令", "Run OS Command");
|
||||
break;
|
||||
case 2:
|
||||
this.nowNode["title"] = LANG("执行JavaScript", "Run JavaScript");
|
||||
this.nowNode["title"] = LANG("循环内元素执行JS", "Run JS in Loop");
|
||||
break;
|
||||
case 3:
|
||||
this.nowNode["title"] = LANG("退出循环", "Exit Loop");
|
||||
@ -160,6 +198,15 @@ let app = new Vue({
|
||||
case 9:
|
||||
this.nowNode["title"] = LANG("发送邮件", "Send Email");
|
||||
break;
|
||||
case 10:
|
||||
this.nowNode["title"] = LANG("清空字段值", "Clear Field Value");
|
||||
break;
|
||||
case 11:
|
||||
this.nowNode["title"] = LANG("生成新行", "Generate New Row");
|
||||
break;
|
||||
case 12:
|
||||
this.nowNode["title"] = LANG("退出程序", "Exit Program");
|
||||
break;
|
||||
case -1: // do not change the title when switching to another action
|
||||
break;
|
||||
default: // default case
|
||||
@ -433,7 +480,7 @@ function operationChange(e, theNode) {
|
||||
if (nowNode != null) {
|
||||
nowNode.style.borderColor = "skyblue";
|
||||
}
|
||||
nowNode = theNode
|
||||
nowNode = theNode;
|
||||
vueData.nowNodeIndex = actionSequence[theNode.getAttribute("data")];
|
||||
theNode.style.borderColor = "blue";
|
||||
handleElement(); // handle the element
|
||||
@ -467,7 +514,7 @@ function elementDblClick(e) {
|
||||
showInfo(LANG("试运行功能不适用于循环操作,请试运行循环内部的具体操作,如点击元素。", "The trial run function is not applicable to loop operations. Please try to run the specific operations in the loop, such as clicking elements."));
|
||||
}
|
||||
} else {
|
||||
if (nodeType == 5 && (app._data.nowNode["parameters"]["codeMode"] != 0 && app._data.nowNode["parameters"]["codeMode"] != 8)) {
|
||||
if (nodeType == 5 && (app._data.nowNode["parameters"]["codeMode"] != 0 && app._data.nowNode["parameters"]["codeMode"] != 2 && app._data.nowNode["parameters"]["codeMode"] != 8)) {
|
||||
showInfo(LANG("试运行自定义操作功能只适用于执行JavaScript和刷新页面操作。", "Trial run for custom actions is only available for the Run JavaScript and Refresh Page operations."));
|
||||
} else {
|
||||
trailElement(app._data.nowNode, 1);
|
||||
@ -505,8 +552,7 @@ function toolBoxKernel(e, param = null) {
|
||||
// let tarrow = DeepClone(app.$data.nowArrow);
|
||||
// refresh();
|
||||
// app._data.nowArrow =tarrow;
|
||||
}
|
||||
else if (option == 11) { // copy action
|
||||
} else if (option == 11) { // copy action
|
||||
if (nowNode == null) {
|
||||
e.stopPropagation(); // prevent event bubbling
|
||||
} else if (nowNode.getAttribute("dataType") > 0) {
|
||||
@ -528,8 +574,7 @@ function toolBoxKernel(e, param = null) {
|
||||
$("#" + t["id"]).click(); // after copying, click the copied element
|
||||
e.stopPropagation(); // prevent event bubbling
|
||||
}
|
||||
}
|
||||
else if (option == 10) { // cut action
|
||||
} else if (option == 10) { // cut action
|
||||
if (nowNode == null) {
|
||||
e.stopPropagation(); // prevent event bubbling
|
||||
} else if ($(nowNode).is(".branch")) {
|
||||
@ -574,8 +619,7 @@ function toolBoxKernel(e, param = null) {
|
||||
e.stopPropagation(); // prevent event bubbling
|
||||
}
|
||||
}
|
||||
}
|
||||
else if (option > 0) { // add action
|
||||
} else if (option > 0) { // add action
|
||||
let l = nodeList.length;
|
||||
let nt = null;
|
||||
let nt2 = null;
|
||||
@ -676,13 +720,13 @@ function toolBoxKernel(e, param = null) {
|
||||
} else {
|
||||
$("#" + t["id"]).click();
|
||||
}
|
||||
|
||||
if (e != null)
|
||||
if (e != null) {
|
||||
e.stopPropagation(); // prevent event bubbling
|
||||
}
|
||||
option = 0;
|
||||
return t;
|
||||
}
|
||||
option = 0;
|
||||
updateParentNode();
|
||||
}
|
||||
|
||||
$(".options").mousedown(function () {
|
||||
@ -935,4 +979,4 @@ function inputDelete(e) {
|
||||
e.stopPropagation(); // pressing Delete inside an input box should work normally
|
||||
// In Electron, calling showError or confirm freezes the input box afterwards, so it is best to avoid them
|
||||
}
|
||||
}
|
||||
}
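The `getFinalXPath` method added earlier in this file's diff combines a loop item's XPath with an element's relative XPath. A standalone sketch of the same logic is shown here for reference; the third argument `parent` is a hypothetical plain object standing in for `this.parentNode.parameters`, so that the function can run outside the Vue instance. It is an illustration of the rule, not the project's code.

```javascript
// Standalone sketch of the XPath-combining rule from the diff.
// `parent` is a hypothetical stand-in for this.parentNode.parameters.
function getFinalXPath(xpath, useLoop, parent) {
  // No usable loop parent, or the element is not relative to the loop item:
  // the XPath is used as written.
  if (!useLoop || parent == null || parent.xpath == null) {
    return xpath;
  }
  // loopType 2: the loop holds a newline-separated list of XPaths,
  // so the relative XPath is appended to every entry (one per line).
  if (parent.loopType == 2) {
    return parent.pathList
      .split("\n")
      .map((p) => p + xpath + "\n")
      .join("");
  }
  // Otherwise the relative XPath is appended to the loop item's XPath.
  return parent.xpath + xpath;
}

// The example from the tooltip text in the HTML diff:
console.log(
  getFinalXPath("/*[@id='tab-customer']", true, {
    xpath: "/html/body/div[1]",
    loopType: 1,
  })
); // prints /html/body/div[1]/*[@id='tab-customer']
```

Passing the parent in as an argument makes the rule easy to check in isolation; in the diff it is read from `this.parentNode` instead.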
|
@ -89,7 +89,7 @@
|
||||
</div>
|
||||
|
||||
<div>
|
||||
<label>提示:鼠标移到笑脸可查看提示,在流程图中<b>双击</b>操作可试运行,<b>右键</b>点击操作查看更多选项。</label>
|
||||
<label>提示:鼠标移到笑脸可查看提示,在流程图中<b>双击</b>操作可<b>试运行</b>(页面完全加载完毕后),<b>右键</b>点击操作查看更多选项。</label>
|
||||
<label>选项名称:</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model='list.nl[index.nowNodeIndex]["title"]'></input>
|
||||
</div>
|
||||
@ -145,38 +145,11 @@
|
||||
</div>
|
||||
<div>
|
||||
<label>XPath(或者用point(10,10)表示点击网页坐标位置(10, 10)以用来点击空白区域推出弹窗对话框文本列表等): <span style="font-size: 30px!important;" title="相对XPATH写法:以/开头,如循环项XPATH为/html/body/div[1],您的输入为/*[@id='tab-customer'],则最终寻址的xpath为:/html/body/div[1]/*[@id='tab-customer']">☺</span></label>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='nowNode["parameters"]["xpath"]'></textarea>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='xpath'></textarea>
|
||||
<p><button type="button" data-toggle="modal" data-target="#myModal_XPath" @click="changeXPaths(nowNode['parameters']['allXPaths'])" class="btn btn-primary" style="margin-top: 10px">点此查看其他等价的XPath</button></p>
|
||||
<label>任务运行时最终定位的本元素XPath:</label>
|
||||
<textarea v-model="getFinalXPath(nowNode['parameters']['xpath'], useLoop)" spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" readonly style="background:ghostwhite"></textarea>
|
||||
</div>
|
||||
<label>点击后页面加载最长等待时间(秒):</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['maxWaitTime']" type="number" required></input>
|
||||
<label>点击类型:</label>
|
||||
<select v-model='nowNode["parameters"]["clickWay"]' class="form-control">
|
||||
<option :value = 0>Selenium点击</option>
|
||||
<option :value = 1>JavaScript点击</option>
|
||||
</select>
|
||||
<label>在新标签页打开超链接:</label>
|
||||
<select v-model='nowNode["parameters"]["newTab"]' class="form-control">
|
||||
<option :value = 1>是</option>
|
||||
<option :value = 0>否</option>
|
||||
</select>
|
||||
<label>点击后是否向下滚动页面:</label>
|
||||
<select v-model='nowNode["parameters"]["scrollType"]' class="form-control">
|
||||
<option :value = 0>不滚动</option>
|
||||
<option :value = 1>向下滚动一屏</option>
|
||||
<option :value = 2>滚动到底部</option>
|
||||
<option :value = 3>一直滚动直到页面内容无变化(需设置好滚动后的等待时间用于检测页面变化)</option>
|
||||
</select>
|
||||
<label>滚动次数(滚动类型设置为<b>不滚动</b>或<b>一直滚动</b>时请忽略此项):</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input>
|
||||
<label>滚动后等待时间(秒):</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollWaitTime']" type="number" required></input>
|
||||
<label>点击元素后如有弹窗出现,弹窗处理方式:</label>
|
||||
<p><select v-model='nowNode["parameters"]["alertHandleType"]' class="form-control">
|
||||
<option :value = 0>无弹窗</option>
|
||||
<option :value = 1>接受弹窗(点击弹窗确定按钮)</option>
|
||||
<option :value = 2>拒绝弹窗(点击弹窗取消按钮,仅限Confirm弹框)</option>
|
||||
</select></p>
|
||||
<p style="margin-top: 10px">
|
||||
<a class="btn btn-primary" data-toggle="collapse" href="#collapseExample" role="button" aria-expanded="false" aria-controls="collapseExample">
|
||||
点此展开/折叠自定义操作
|
||||
@ -195,6 +168,38 @@
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" required class="form-control" type="number" v-model.number='nowNode["parameters"]["afterJSWaitTime"]'></input>
|
||||
</div>
|
||||
</div>
|
||||
<label>点击后页面加载最长等待时间(秒):</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['maxWaitTime']" type="number" required></input>
|
||||
<label>点击类型(如是否双击):</label>
|
||||
<select v-model='nowNode["parameters"]["clickWay"]' class="form-control">
|
||||
<option :value = 0>Selenium点击</option>
|
||||
<option :value = 1>JavaScript点击</option>
|
||||
<option :value = 2>双击</option>
|
||||
</select>
|
||||
<label>在新标签页打开超链接:</label>
|
||||
<select v-model='nowNode["parameters"]["newTab"]' class="form-control">
|
||||
<option :value = 1>是</option>
|
||||
<option :value = 0>否</option>
|
||||
</select>
|
||||
<label>点击后是否向下滚动页面:</label>
|
||||
<select v-model='nowNode["parameters"]["scrollType"]' class="form-control">
|
||||
<option :value = 0>不滚动</option>
|
||||
<option :value = 1>向下滚动一屏</option>
|
||||
<option :value = 2>滚动到底部</option>
|
||||
<option :value = 3>一直滚动直到页面内容无变化(需设置好滚动后的等待时间用于检测页面变化)</option>
|
||||
</select>
|
||||
<label>滚动次数(滚动类型设置为<b>不滚动</b>或<b>一直滚动</b>时请忽略此项):</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input>
|
||||
<label>滚动后等待时间(秒):</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollWaitTime']" type="number" required></input>
|
||||
<label>文件下载最长等待时间(秒):</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['downloadWaitTime']" type="number" required></input>
|
||||
<label>点击元素后如有弹窗出现,弹窗处理方式:</label>
|
||||
<p><select v-model='nowNode["parameters"]["alertHandleType"]' class="form-control">
|
||||
<option :value = 0>无弹窗</option>
|
||||
<option :value = 1>接受弹窗(点击弹窗确定按钮)</option>
|
||||
<option :value = 2>拒绝弹窗(点击弹窗取消按钮,仅限Confirm弹框)</option>
|
||||
</select></p>
|
||||
|
||||
|
||||
|
||||
@ -237,6 +242,9 @@
|
||||
<p>XPath(所有XPath内均可用Field["字段名"]表示参数值,用eval("你的代码")来替换成自定义的变量): <span style="font-size: 30px!important;" title="相对XPATH写法:以/开头,如循环项XPATH为/html/body/div[1],您的输入为/*[@id='tab-customer'],则最终寻址的xpath为:/html/body/div[1]/*[@id='tab-customer']">☺</span></p>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='params.parameters[paraIndex]["relativeXPath"]' placeholder="如果要写相对循环内的xpath,可以写如*../div[1]即匹配当前循环元素的父元素的第一个div子元素"></textarea>
|
||||
<p><button type="button" data-toggle="modal" data-target="#myModal_XPath" @click="changeXPaths(params.parameters[paraIndex]['allXPaths'])" class="btn btn-primary" style="margin-top: 10px">点此查看其他等价的XPath</button></p>
|
||||
<label>任务运行时最终定位的本字段XPath:</label>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" readonly style="background:ghostwhite">{{getFinalXPath(params.parameters[paraIndex]['relativeXPath'], params.parameters[paraIndex]['relative'])}}</textarea>
|
||||
<div style="margin-top: 10px"><a href="#" v-on:mousedown="trailParam(paraIndex)" style="text-decoration: none">试运行(只测试第一个匹配到的元素)</a></div>
|
||||
<p style="margin-top: 10px">
|
||||
<a class="btn btn-primary" data-toggle="collapse" href="#elementAdvanced" role="button" aria-expanded="false" aria-controls="collapseExample">
|
||||
点此展开/折叠自定义操作
|
||||
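Note on the hunk above: the tooltip on the XPath field describes the relative-XPath rule. With a loop item XPath of /html/body/div[1] and an input of /*[@id='tab-customer'], the element is finally located at /html/body/div[1]/*[@id='tab-customer']. A minimal sketch of that rule follows. `joinRelativeXPath` is a hypothetical helper; the real `getFinalXPath` is not shown in this diff and may behave differently (for example, it takes only the input and a `relative` flag, so it presumably reads the loop XPath from elsewhere).

```javascript
// Hypothetical helper illustrating the tooltip's rule only; this is an
// assumption, not the repository's getFinalXPath implementation.
function joinRelativeXPath(loopXPath, input, relative) {
  // A field that is not relative to a loop item is used unchanged.
  if (!relative) {
    return input;
  }
  // A relative field is appended to the XPath of the loop item.
  return loopXPath + input;
}

console.log(joinRelativeXPath("/html/body/div[1]", "/*[@id='tab-customer']", true));
// prints /html/body/div[1]/*[@id='tab-customer']
```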
@ -244,7 +252,6 @@
|
||||
</p>
|
||||
<div :class="{collapse: true, 'show': params.parameters[paraIndex]['beforeJS'].length!=0 || params.parameters[paraIndex]['afterJS'].length!=0}" id="elementAdvanced">
|
||||
<div>
|
||||
<div><a href="#" v-on:mousedown="trailParam(paraIndex)" style="text-decoration: none">试运行</a></div>
|
||||
<label>提取该元素数据<strong>前</strong>针对该元素执行一段JavaScript脚本: </label>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2"
|
||||
placeholder='该元素用arguments[0]来表示,示例JS代码:arguments[0].innerText = arguments[0].innerText.replace("上海","Shanghai")即实现了将元素文字中的“上海”替换成”Shanghai“的功能,然后后续如提取数据时就会提取到替换后的值。' v-model='params.parameters[paraIndex]["beforeJS"]'></textarea>
|
||||
@ -256,21 +263,6 @@
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" required class="form-control" type="number" v-model.number='params.parameters[paraIndex]["afterJSWaitTime"]'></input>
|
||||
</div>
|
||||
</div>
|
||||
<label>参数类型转换为(用于Excel和数据库):</label>
|
||||
<select v-model='params.parameters[paraIndex]["paraType"]' class="form-control">
|
||||
<option value = "text">文本(单个值长度预估超过1万请选择大文本)</option>
|
||||
<option value = "int">整数(位数在9位以内)</option>
|
||||
<option value = "double">浮点数(小数)</option>
|
||||
<option value = "mediumText">大文本(单个值长度超过1万低于100万)</option>
|
||||
<option value = "datetime">日期时间</option>
|
||||
<option value = "date">日期</option>
|
||||
<option value = "time">时间</option>
|
||||
<option value = "varchar">小文本(单个值长度小于50)</option>
|
||||
<option value = "longText">超大文本(单个值长度超过100万)</option>
|
||||
<option value = "bigInt">大整数(位数超过9位)</option>
|
||||
</select>
|
||||
<label>元素找不到时的值:</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model='params.parameters[paraIndex]["default"]'></input>
|
||||
<label>采集内容类型</label>
|
||||
<select v-model='params.parameters[paraIndex]["contentType"]' class="form-control">
|
||||
<option :value = 0>文本(包括子元素)</option>
|
||||
@ -280,6 +272,7 @@
|
||||
<option :value = 4>背景图片地址</option>
|
||||
<option :value = 5>页面网址</option>
|
||||
<option :value = 6>页面标题</option>
|
||||
<option :value = 15>常量字符串</option>
|
||||
<option :value = 7>元素截图</option>
|
||||
<option :value = 8>OCR识别文字</option>
|
||||
<option :value = 14>元素的属性值</option>
|
||||
@ -289,7 +282,11 @@
|
||||
<option :value = 10>当前选择框选中的选项值</option>
|
||||
<option :value = 11>当前选择框选中的选项文本</option>
|
||||
</select>
|
||||
<div v-if='params.parameters[paraIndex]["contentType"] == 14'>
|
||||
<div v-if='params.parameters[paraIndex]["contentType"] == 15'>
|
||||
<label>常量字符串:</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model='params.parameters[paraIndex]["JS"]' placeholder="此字段类型通常作为备注使用"></input>
|
||||
</div>
|
||||
<div v-else-if='params.parameters[paraIndex]["contentType"] == 14'>
|
||||
<label>属性名称:</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model='params.parameters[paraIndex]["JS"]' placeholder="属性名称,如class表示当前元素的class属性值,即元素所拥有的类名。"></input>
|
||||
</div>
|
||||
@ -320,6 +317,21 @@
|
||||
<!-- <option :value = 0>普通提取</option>-->
|
||||
<!-- <option :value = 1>OCR提取</option>-->
|
||||
<!-- </select>-->
|
||||
<label>参数类型转换为(用于Excel和数据库):</label>
|
||||
<select v-model='params.parameters[paraIndex]["paraType"]' class="form-control">
|
||||
<option value = "text">文本(单个值长度预估超过1万请选择大文本)</option>
|
||||
<option value = "int">整数(位数在9位以内)</option>
|
||||
<option value = "double">浮点数(小数)</option>
|
||||
<option value = "mediumText">大文本(单个值长度超过1万低于100万)</option>
|
||||
<option value = "datetime">日期时间</option>
|
||||
<option value = "date">日期</option>
|
||||
<option value = "time">时间</option>
|
||||
<option value = "varchar">小文本(单个值长度小于50)</option>
|
||||
<option value = "longText">超大文本(单个值长度超过100万)</option>
|
||||
<option value = "bigInt">大整数(位数超过9位)</option>
|
||||
</select>
|
||||
<label>元素找不到时的值:</label>
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" class="form-control" v-model='params.parameters[paraIndex]["default"]'></input>
|
||||
<label style="margin-top: 15px">是否将内容换行(长文章采集想要换行时设置):</label>
|
||||
<select v-model='params.parameters[paraIndex]["splitLine"]' class="form-control">
|
||||
<option :value = 0>否</option>
|
||||
@ -388,8 +400,11 @@
|
||||
<option :value = 5>在执行环境下运行Python代码(exec操作)</option>
|
||||
<option :value = 6>在执行环境下获得Python表达式值(eval操作)</option>
|
||||
<option :value = 7>暂停程序执行(如检测到验证码框出现时暂停执行)</option>
|
||||
<option :value = 12>退出程序</option>
|
||||
<option :value = 8>刷新页面</option>
|
||||
<option :value = 9>发送邮件</option>
|
||||
<option :value = 10>清空所有字段值</option>
|
||||
<option :value = 11>生成新数据行</option>
|
||||
</select>
|
||||
<div v-if='nowNode["parameters"]["codeMode"] < 3 || nowNode["parameters"]["codeMode"] >= 5 && nowNode["parameters"]["codeMode"] <=6'>
|
||||
<label>代码/脚本内容(用Field["字段名"]来输入某字段/自定义操作的最新提取/返回值): </label>
|
||||
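Note on the `v-if` expression above: it mixes `||` and `&&` without parentheses. Because `&&` binds tighter than `||`, the code editor block is shown for modes below 3 and for modes 5 and 6 (the Python exec and eval options). The sketch below enumerates the visible modes under that reading. `showsCodeEditor` is a hypothetical name, and the 0 to 12 range is an assumption based on the option values visible in this hunk.

```javascript
// Mirrors the v-if expression with the implicit grouping made explicit.
function showsCodeEditor(codeMode) {
  return codeMode < 3 || (codeMode >= 5 && codeMode <= 6);
}

// Collect every mode in the assumed 0..12 range that shows the editor.
const visibleModes = [];
for (let mode = 0; mode <= 12; mode++) {
  if (showsCodeEditor(mode)) {
    visibleModes.push(mode);
  }
}

console.log(visibleModes.join(", ")); // prints 0, 1, 2, 5, 6
```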
@ -480,7 +495,12 @@ print(emotlib.emoji()) # 使用其中的函数。
|
||||
<label>邮件内容:</label>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='nowNode["parameters"]["emailConfig"]["content"]' placeholder="这里写邮件内容"></textarea>
|
||||
</div>
|
||||
|
||||
<div v-if='nowNode["parameters"]["codeMode"] == 10'>
|
||||
<label>此操作可以清空所有字段值,如用于爬虫任务开始前清空所有字段值。</label>
|
||||
</div>
|
||||
<div v-if='nowNode["parameters"]["codeMode"] == 11'>
|
||||
<label>此操作可以生成新数据行,如用于爬虫任务设计时暂不生成数据行,等所有字段提取结束后统一生成新数据行。</label>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="elements" v-if="nodeType==6">
|
||||
@ -517,8 +537,10 @@ print(emotlib.emoji()) # 使用其中的函数。
|
||||
</div>
|
||||
<div>
|
||||
<label>XPath: </label>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='nowNode["parameters"]["xpath"]'></textarea>
|
||||
<textarea spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" v-model='xpath'></textarea>
|
||||
<p><button type="button" data-toggle="modal" data-target="#myModal_XPath" @click="changeXPaths(nowNode['parameters']['allXPaths'])" class="btn btn-primary" style="margin-top: 10px">点此查看其他等价的XPath</button></p>
|
||||
<label>任务运行时最终定位的本元素XPath:</label>
|
||||
<textarea v-model="getFinalXPath(nowNode['parameters']['xpath'], useLoop)" spellcheck=false onkeydown="inputDelete(event)" class="form-control" rows="2" readonly style="background:ghostwhite"></textarea>
|
||||
</div>
|
||||
|
||||
|
||||
@ -557,8 +579,8 @@ print(emotlib.emoji()) # 使用其中的函数。
|
||||
<pre class="form-control" style="background: white; margin-top: 20px; min-height: 220px; font-size: 15px!important; word-wrap: break-word; white-space: pre-wrap; border-radius: 0; border: 1px solid" disabled v-if='parseInt(loopType) == 7'>请先阅读此说明,再在上方输入框(不是本框)写具体代码,如果要执行大量代码,可以直接写outside:myCode.py,这样程序就会读取并执行EasySpider目录下的myCode.py中的代码。
|
||||
根据Python代码的表达式值来决定是否循环,示例:
|
||||
1. 返回当前浏览器对象的相关值,用self.browser表示当前操作的浏览器,可直接用selenium的API进行操作,如self.browser.find_element(By.CSS_SELECTOR, "body").text=="123",表示判断当前页面是否为123这个文本。
|
||||
2. 返回自定义全局变量的值:self.myVar,如果
|
||||
3. 返回条件判断的值:self.myVar == 1
|
||||
2. 返回自定义全局变量的值:self.myVar
|
||||
3. 返回条件判断的值:self.myVar > 1
|
||||
4. 判断某个字段提取的值是否等于某个变量的值:self.outputParameters["字段名"] == self.myVar
|
||||
以上表达式返回值大于0或为真则继续循环,否则停止循环。
|
||||
</pre>
|
||||
@ -700,8 +722,8 @@ print(emotlib.emoji()) # 使用其中的函数。
|
||||
<input spellcheck=false onkeydown="inputDelete(event)" id="serviceDescription" name="serviceDescription" class="form-control"></input>
|
||||
<label>导出数据格式(Excel/CSV/TXT/数据库,<a href="https://www.bilibili.com/video/BV1os4y1679S/" target="_blank">查看MySQL操作教程</a>):</label>
|
||||
<select id="outputFormat" class="form-control">
|
||||
<option value = "xlsx">XLSX(即EXCEL文件,建议单个单元格长度超过500时使用CSV格式存储)</option>
|
||||
<option value = "csv">CSV(采集长文章推荐使用此格式)</option>
|
||||
<option value = "xlsx">XLSX(即EXCEL文件,建议单个单元格长度超过500时使用CSV格式存储)</option>
|
||||
<option value = "txt">TXT</option>
|
||||
<option value = "json">JSON</option>
|
||||
<option value = "mysql">MySQL数据库(大量数据推荐使用)</option>
|
||||
@ -712,6 +734,7 @@ print(emotlib.emoji()) # 使用其中的函数。
|
||||
<select id="dataWriteMode" name="dataWriteMode" class="form-control">
|
||||
<option value=1>追加写入(如果文件已存在则在原文件后面追加)</option>
|
||||
<option value=2>覆盖写入(如果文件已存在则覆盖原文件)</option>
|
||||
<option value=3>重命名写入(如果文件已存在则重命名文件)</option>
|
||||
</select>
|
||||
<!-- <label>是否为Cloudflare等极端反爬网站(<a href="https://www.bilibili.com/video/BV1Ph4y1E7R9/" target="_blank">查看Cloudflare设计和执行教程</a>):</label>-->
|
||||
<!-- <select id="cloudflare" name="cloudflare" class="form-control">-->
|
||||
|
BIN
ElectronJS/src/taskGrid/element-ui/fonts/element-icons.woff
Normal file
Binary file not shown.
15678
ElectronJS/src/taskGrid/element-ui/index.css
Normal file
File diff suppressed because it is too large
1
ElectronJS/src/taskGrid/element-ui/index.js
Normal file
File diff suppressed because one or more lines are too long
@ -31,18 +31,20 @@
|
||||
.ID {
|
||||
width: 10%;
|
||||
}
|
||||
.excel th,.excel td {
|
||||
|
||||
.excel th, .excel td {
|
||||
text-align: center;
|
||||
font-size: 13px;
|
||||
padding: 10px;
|
||||
max-width: 200px!important;
|
||||
max-width: 200px !important;
|
||||
}
|
||||
|
||||
.tip {
|
||||
position: fixed;
|
||||
width:100%;
|
||||
width: 100%;
|
||||
display: none;
|
||||
z-index: 1000;
|
||||
top:0;
|
||||
top: 0;
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
@ -59,31 +61,40 @@
|
||||
提示:任务执行ID对应配置文件已更新,您可使用任务ID:<span id="newID_ZH"></span>来执行加载了新配置的任务。
|
||||
</div>
|
||||
<div id="tipID_EN" class="alert alert-info alert-dismissible fade show tip">
|
||||
Hint: The task execution ID corresponds to the configuration file has been updated, you can use the task ID <span id="newID_EN"></span> to execute the task with the new configuration.</div>
|
||||
Hint: The task execution ID corresponds to the configuration file has been updated, you can use the task ID
|
||||
<span id="newID_EN"></span> to execute the task with the new configuration.
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="row" style="margin-top: 40px;">
|
||||
|
||||
<div class="col-md-7" id="taskInfo" style="margin:0 auto" v-if="show">
|
||||
<div id="tipCustom" class="alert alert-success alert-dismissible fade show" style="display: none; z-index: 1000">
|
||||
{{tip | lang}}</div>
|
||||
<div class="col-md-8" id="taskInfo" style="margin:0 auto" v-if="show">
|
||||
<div id="tipCustom" class="alert alert-success alert-dismissible fade show"
|
||||
style="display: none; z-index: 1000">
|
||||
{{tip | lang}}
|
||||
</div>
|
||||
|
||||
<div class="modal fade" id="myModal" tabindex="-1" role="dialog" aria-labelledby="myModalLabel" aria-hidden="true">
|
||||
<div class="modal fade" id="myModal" tabindex="-1" role="dialog" aria-labelledby="myModalLabel"
|
||||
aria-hidden="true">
|
||||
<div class="modal-dialog modal-lg">
|
||||
<div class="modal-content">
|
||||
<div class="modal-header">
|
||||
<h4 class="modal-title" id="myModalLabel">{{"Task Execution Instruction~执行任务说明" | lang}}</h4>
|
||||
<h4 class="modal-title"
|
||||
id="myModalLabel">{{"Task Execution Instruction~执行任务说明" | lang}}</h4>
|
||||
<button type="button" class="close" data-dismiss="modal" aria-hidden="true">×</button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<input onkeydown="inputDelete(event)" id="serviceId" type="hidden" name="serviceId" value="-1"></input>
|
||||
<input onkeydown="inputDelete(event)" id="url" type="hidden" name="url" value="about:blank"></input>
|
||||
<label><a href="https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction" target="_blank">{{`Click Here~点击这里` | lang}}</a> {{`Here to see argument instruction.~这里查看参数配置说明。` | lang}}</label>
|
||||
<label v-if="OS=='darwin'">{{`对于MacOS系统,EasySpider提供了两个不同的执行程序,分别为easyspider_executestage和easyspider_executestage_full,前者执行时加载速度较快,并提供了除OCR识别和数据去重以外的全部功能;后者则提供了包括OCR识别和数据去重在内的全部功能,但运行时加载速度较慢,需要等待2-10分钟才能执行程序,请根据自己的需求选择执行哪个程序。~For MacOS system, EasySpider provides two different execution programs, 'easyspider_executestage' and 'easyspider_executestage_full', the former loads faster when executing, and provides all functions except OCR recognition and data deduplication; the latter provides all functions including OCR recognition and data deduplication, but the loading speed is slower when running, and it takes 2-10 minutes to wait for the program to execute, please choose which program to execute according to your needs.` | lang}}</label>
|
||||
<label>{{ `Please open a terminal (For Windows, please use PowerShell instead of CMD), go to EasySpider's folder, and then copy (Command/Ctrl + c) the following command to run the task (EasySpider cannot quit when executing command, unless --read_type is set to "local"):~请在EasySpider目录下打开命令行工具Terminal (Windows请使用PowerShell而不是CMD),然后复制(Command/Ctrl + c)和运行以下命令以执行任务(执行命令时不能退出EasySpider,除非将--read_type设置为local):` | lang }}</label>
|
||||
<input onkeydown="inputDelete(event)" id="serviceId" type="hidden" name="serviceId"
|
||||
value="-1"></input>
|
||||
<input onkeydown="inputDelete(event)" id="url" type="hidden" name="url"
|
||||
value="about:blank"></input>
|
||||
<label><a href="https://github.com/NaiboWang/EasySpider/wiki/Argument-Instruction"
|
||||
target="_blank">{{`Click Here~点击这里` | lang}}</a> {{`Here to see argument instruction.~这里查看参数配置说明。` | lang}}</label>
|
||||
<label v-if="OS=='MacOS'">{{`对于MacOS系统,EasySpider提供了两个不同的执行程序,分别为easyspider_executestage和easyspider_executestage_full,前者执行时加载速度较快,并提供了除OCR识别和数据去重以外的全部功能;后者则提供了包括OCR识别和数据去重在内的全部功能,但运行时加载速度较慢,需要等待2-10分钟才能执行程序,请根据自己的需求选择执行哪个程序。~For MacOS system, EasySpider provides two different execution programs, 'easyspider_executestage' and 'easyspider_executestage_full', the former loads faster when executing, and provides all functions except OCR recognition and data deduplication; the latter provides all functions including OCR recognition and data deduplication, but the loading speed is slower when running, and it takes 2-10 minutes to wait for the program to execute, please choose which program to execute according to your needs.` | lang}}</label>
|
||||
<label>{{ `Please open a terminal (For Windows, please use PowerShell instead of CMD), go to EasySpider's folder, and then copy (Command/Ctrl + c) the following command to run the task (EasySpider can quit when executing command for ease of timed execution, and you can set --read_type to "remote" for remote execution):~请在EasySpider目录下打开命令行工具Terminal (Windows请使用PowerShell而不是CMD),然后复制(Command/Ctrl + c)和运行以下命令以执行任务(执行命令时可以退出EasySpider以方便定时执行,如需要远程调用则需要将--read_type设置为remote并设置远程地址):` | lang }}</label>
|
||||
<textarea class="form-control" style="height:150px">cd {{easyspider_location}}
|
||||
{{command}} --config_folder "{{config_folder}}" --headless 0 --read_type remote --config_file_name config.json --saved_file_name </textarea>
|
||||
{{command}} --config_folder "{{config_folder}}" --headless 0 --read_type local --config_file_name config.json --saved_file_name </textarea>
|
||||
</div>
|
||||
<!-- <div class="modal-footer">
|
||||
<button type="button" id="saveAsButton" class="btn btn-outline-primary">另存为</button>
|
||||
@ -94,34 +105,38 @@
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="modal fade" id="excelModal" tabindex="-1" role="dialog" aria-labelledby="excelModalLabel" aria-hidden="true">
|
||||
<div class="modal fade" id="excelModal" tabindex="-1" role="dialog" aria-labelledby="excelModalLabel"
|
||||
aria-hidden="true">
|
||||
<div class="modal-dialog modal-lg">
|
||||
<div class="modal-content">
|
||||
<div class="modal-header">
|
||||
<h4 class="modal-title" id="excelModalLabel">{{"Read from Excel~从Excel文件读取输入参数" | lang}}</h4>
|
||||
<h4 class="modal-title"
|
||||
id="excelModalLabel">{{"Read from Excel~从Excel文件读取输入参数" | lang}}</h4>
|
||||
<button type="button" class="close" data-dismiss="modal" aria-hidden="true">×</button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<!-- <form action="/upload" method="post" enctype="multipart/form-data">-->
|
||||
<!-- <form action="/upload" method="post" enctype="multipart/form-data">-->
|
||||
<div>
|
||||
<div class="form-group" style="margin-bottom: 10px">
|
||||
<label>{{"Please select an Excel file (.xlsx)~请选择一个Excel文件(.xlsx)" | lang}}</label>
|
||||
<input type="file" class="form-control-file" id="excelFile" name="file">
|
||||
<label style="display: block; margin-top:10px;margin-bottom: 0">{{fileUploadStatus | lang}}</label>
|
||||
</div>
|
||||
<div class="form-group" style="margin-bottom: 10px">
|
||||
<label>{{"Please select an Excel file (.xlsx)~请选择一个Excel文件(.xlsx)" | lang}}</label>
|
||||
<input type="file" class="form-control-file" id="excelFile" name="file">
|
||||
<label style="display: block; margin-top:10px;margin-bottom: 0">{{fileUploadStatus | lang}}</label>
|
||||
</div>
|
||||
|
||||
<button @click="submitFile" class="btn btn-primary" style="min-width: 100px;margin-bottom:10px">{{"Upload~上传" | lang}}</button>
|
||||
<button @click="submitFile" class="btn btn-primary"
|
||||
style="min-width: 100px;margin-bottom:10px">{{"Upload~上传" | lang}}
|
||||
</button>
|
||||
|
||||
<!-- </form>-->
|
||||
<!-- </form>-->
|
||||
</div>
|
||||
<label style="margin:10px 0">{{"Please design an Excel file (.xlsx) according to the following format and upload it above.~请按照以下格式设计一个Excel文件(.xlsx),并在上方上传:" | lang}}</label>
|
||||
<label style="margin:10px 0">{{"Please design an Excel file (.xlsx) according to the following format and upload it above.~请按照以下格式设计一个Excel文件(.xlsx),并在上方上传:" | lang}}</label>
|
||||
<table class="table table-bordered excel" style="text-align: center; font-size: 13px">
|
||||
<thead>
|
||||
<tr>
|
||||
<th scope="col">{{"Invoke Name 1~调用名称1" | lang}}</th>
|
||||
<th scope="col">{{"Invoke Name 2~调用名称2" | lang}}</th>
|
||||
<th scope="col">...</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<th scope="col">{{"Invoke Name 1~调用名称1" | lang}}</th>
|
||||
<th scope="col">{{"Invoke Name 2~调用名称2" | lang}}</th>
|
||||
<th scope="col">...</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
@ -145,15 +160,17 @@
|
||||
<label>{{"You can just put part of the arguments in the Excel file, and the values of the rest of the arguments will be set to default. Example table for this task is:~您可以只在Excel文件中放入部分参数,其余参数值将被设置为默认值。一个针对此任务的表格示例为:" | lang}}</label>
|
||||
<table class="table table-bordered excel">
|
||||
<thead>
|
||||
<tr>
|
||||
<th v-for="i in Math.min(3, task.inputParameters.length)" v-if="task.inputParameters.length>0">
|
||||
{{task.inputParameters[i-1]["name"]}}
|
||||
</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<th v-for="i in Math.min(3, task.inputParameters.length)"
|
||||
v-if="task.inputParameters.length>0">
|
||||
{{task.inputParameters[i-1]["name"]}}
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td v-for="i in Math.min(3, task.inputParameters.length)" v-if="task.inputParameters.length>0">
|
||||
<td v-for="i in Math.min(3, task.inputParameters.length)"
|
||||
v-if="task.inputParameters.length>0">
|
||||
{{getLine(i,0)}}
|
||||
</td>
|
||||
<tr>
|
||||
@ -171,8 +188,10 @@
|
||||
<tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<label v-if='lang == "zh"' style="width: 95%">对于循环输入文字的参数(loopText)需要配置索引值的情况,即输入文字操作用了相对循环内的索引值,您可以在Excel文件中写同一个参数名写多列,程序将自动合并。 例如,想要设置'loopText_1'参数值两行,分别为"A~B~C"和"D~E~F",则Excel文件可以这样设置:</label>
|
||||
<label v-else style="width: 95%"> For parameters that need to configure the index value of the loop text (loopText), that is, the input text operation uses the index value relative to the loop, you can write multiple columns with the same parameter name in the Excel file, and the program will automatically merge. For example, if you want to set the parameter value of 'loopText_1' to two rows, which are "A~B~C" and "D~E~F", the Excel file can be set like this:</label>
|
||||
<label v-if='lang == "zh"'
|
||||
style="width: 95%">对于循环输入文字的参数(loopText)需要配置索引值的情况,即输入文字操作用了相对循环内的索引值,您可以在Excel文件中写同一个参数名写多列,程序将自动合并。 例如,想要设置'loopText_1'参数值两行,分别为"A~B~C"和"D~E~F",则Excel文件可以这样设置:</label>
|
||||
<label v-else
|
||||
style="width: 95%"> For parameters that need to configure the index value of the loop text (loopText), that is, the input text operation uses the index value relative to the loop, you can write multiple columns with the same parameter name in the Excel file, and the program will automatically merge. For example, if you want to set the parameter value of 'loopText_1' to two rows, which are "A~B~C" and "D~E~F", the Excel file can be set like this:</label>
|
||||
<table class="table table-bordered excel" style="text-align: center; font-size: 13px">
|
||||
<thead>
|
||||
<tr>
|
||||
@ -204,7 +223,8 @@
|
||||
<nav aria-label="breadcrumb">
|
||||
<ol class="breadcrumb" style="padding-left:0;background-color: white">
|
||||
<li @click="gotoHome" class="breadcrumb-item"><a href="#">{{"Home~首页" | lang}}</a></li>
|
||||
<li @click="gotoInfo" aria-current="page" class="breadcrumb-item" style="color: black"><a href="#">{{"Task Information~任务信息" | lang}}</a></li>
|
||||
<li @click="gotoInfo" aria-current="page" class="breadcrumb-item" style="color: black"><a
|
||||
href="#">{{"Task Information~任务信息" | lang}}</a></li>
|
||||
<li aria-current="page" class="breadcrumb-item active" style="color: black">{{"Task Execution~任务执行"
|
||||
| lang}}
|
||||
</li>
|
||||
@ -215,10 +235,15 @@
|
||||
<p style="word-wrap: break-word;word-break: break-all;overflow: hidden;max-height: 100px;">{{"Task Description:~任务描述:" | lang}} {{task["desc"]}}</p>
|
||||
<p style="word-wrap: break-word;word-break: break-all;overflow: hidden;max-height: 100px;">{{"API URL (POST):~API 调用网址(POST):" |
|
||||
lang}} {{backEndAddressServiceWrapper}}/invokeTask?id={{task["id"]}}</p>
|
||||
<p style="word-wrap: break-word;word-break: break-all;overflow: hidden;max-height: 100px;">{{"URL of how to invoke task by API via POST request (Postman or JavaScript): ~通过POST方式进行API调用的示例教程(Postman或JS代码):" | lang}}<a target="_blank" href="https://github.com/NaiboWang/EasySpider/wiki/API-Invoke-Example">https://github.com/NaiboWang/EasySpider/wiki/API-Invoke-Example</a></p>
|
||||
<p><button class="btn btn-primary" @click="readFromExcel">{{"Read parameters from Excel file~从Excel文件读取输入参数"
|
||||
| lang}}
|
||||
</button></p>
|
||||
<p style="word-wrap: break-word;word-break: break-all;overflow: hidden;max-height: 100px;">{{"URL of how to invoke task by API via POST request (Postman or JavaScript): ~通过POST方式进行API调用的示例教程(Postman或JS代码):" | lang}}<a
|
||||
target="_blank"
|
||||
href="https://github.com/NaiboWang/EasySpider/wiki/API-Invoke-Example">https://github.com/NaiboWang/EasySpider/wiki/API-Invoke-Example</a>
|
||||
</p>
|
||||
<p>
|
||||
<button class="btn btn-primary" @click="readFromExcel">{{"Read parameters from Excel file~从Excel文件读取输入参数"
|
||||
| lang}}
|
||||
</button>
|
||||
</p>
|
||||
<p>{{"Please Input Parameters:~请输入参数值:" | lang}}</p>
|
||||
<form id="form">
|
||||
<table class="table table-bordered">
|
||||
@ -237,7 +262,8 @@
|
||||
<td style="text-align: center; max-width: 250px;white-space: initial">{{task.inputParameters[i-1]["name"]}}</td>
|
||||
<td style="max-width: 100px; text-align: center">{{task.inputParameters[i-1]["type"]}}</td>
|
||||
<td><textarea class="form-control"
|
||||
style="min-height: 50px;min-width: 300px;" v-bind:name="task.inputParameters[i-1]['name']"
|
||||
style="min-height: 50px;min-width: 300px;"
|
||||
v-bind:name="task.inputParameters[i-1]['name']"
|
||||
v-model="task.inputParameters[i-1]['value']"></textarea></td>
|
||||
</tr>
|
||||
</tbody>
|
||||
@ -255,10 +281,12 @@
|
||||
</div>
|
||||
</form>
|
||||
<label style="display: block">{{"Click the button below to execute the task. Long press the pause button (default: p) on the keyboard to pause the task. Manual intervention is possible during the task execution process, ~点击以下按钮执行任务,任务执行过程中可以长按暂停键(默认:p键)暂停任务的执行以便" | lang }}<b>{{"~人工干预," | lang}}</b>{{"such as manually input a password or captcha: ~如手动输入密码,验证码等。" | lang}}</label>
|
||||
<button class="btn btn-primary" v-on:click="localExecuteInstant(false)">{{"Directly Run Locally (Clean Mode)~本地直接执行(纯净模式)" |
|
||||
<button class="btn btn-primary"
|
||||
v-on:click="localExecuteInstant(false)">{{"Directly Run Locally (Clean Mode)~本地直接执行(纯净模式)" |
|
||||
lang}}
|
||||
</button>
|
||||
<button class="btn btn-primary" v-on:click="localExecuteInstant(true)">{{"Directly Run Locally (Data Mode)~本地直接执行(带用户信息模式)" |
|
||||
<button class="btn btn-primary"
|
||||
v-on:click="localExecuteInstant(true)">{{"Directly Run Locally (Data Mode)~本地直接执行(带用户信息模式)" |
|
||||
lang}}
|
||||
</button>
|
||||
<!-- <button style="margin-left: 5px;" v-on:click="remoteExcuteInstant" class="btn btn-primary">Directly Run Remotely</button> -->
|
||||
@ -267,16 +295,21 @@
|
||||
</label>
|
||||
<div style="margin-bottom: 10px;">
|
||||
<label style="margin-top: 10px;">{{"Execution ID (EID), execution files are stored in 'execution_instances' folder, you can write EID by yourself and the set the filename other than 'current_time to append content to the existing file from the EID to achieve incremental collection:~执行ID(执行文件存放在execution_instances文件夹内,提前在下方写好执行ID且文件名不为current_time时可以追加文件内容以实现增量采集):" | lang}}</label>
|
||||
<input class="form-control" v-model="ID" :placeholder="LANG('如果已在此处写/生成了ID号,则点击执行或获得ID按钮后,任务ID将保持不变且原任务文件将会被新配置覆盖','If already have ID here, the task ID will remain unchanged and the original task file will be overwritten by the new configuration after click buttons')"></input>
|
||||
<input class="form-control" v-model="ID"
|
||||
:placeholder="LANG('如果已在此处写/生成了ID号,则点击执行或获得ID按钮后,任务ID将保持不变且原任务文件将会被新配置覆盖','If already have ID here, the task ID will remain unchanged and the original task file will be overwritten by the new configuration after click buttons')"></input>
|
||||
<p></p>
|
||||
<!-- <p>提示:点击下方按钮获得任务ID,然后根据此ID进行服务执行;也可自己POST调用接口得到ID,具体参照POST调用文档。</p> -->
|
||||
<p>{{"Hint: Click the \"Get Execution ID\" button at the bottom to get the task ID, and click the \"Execute task by commandline\" button at the back to get the prompt command on how to run this task using the command line.~提示:点击下方“获得任务执行ID”按钮得到任务ID,点击后面的“使用命令行执行任务”按钮获得如何使用命令行运行任务的提示命令。" | lang}}</p>
|
||||
<button class="btn btn-primary" href="javascript:void(0)" v-on:click="invokeTask">{{"Get Execution ID~获得任务执行ID" |
|
||||
lang}}</button>
|
||||
<button class="btn btn-primary" style="margin-left: 8px;" v-on:click="localExecute(false)">{{"Execute task by commandline (Clean Mode)~使用命令行执行任务(纯净模式)"
|
||||
<button class="btn btn-primary" href="javascript:void(0)"
|
||||
v-on:click="invokeTask">{{"Get Execution ID~获得任务执行ID" |
|
||||
lang}}
|
||||
</button>
|
||||
<button class="btn btn-primary" style="margin-left: 8px;"
|
||||
v-on:click="localExecute(false)">{{"Execute task by commandline (Clean Mode)~使用命令行执行任务(纯净模式)"
|
||||
| lang}}
|
||||
</button>
|
||||
<button class="btn btn-primary" style="margin-left: 8px;" v-on:click="localExecute(true)">{{"Execute task by commandline (Data Mode)~使用命令行执行任务(带用户信息模式)"
|
||||
<button class="btn btn-primary" style="margin-left: 8px;"
|
||||
v-on:click="localExecute(true)">{{"Execute task by commandline (Data Mode)~使用命令行执行任务(带用户信息模式)"
|
||||
| lang}}
|
||||
</button>
|
||||
<!-- <button v-on:click="remoteExcute" style="margin-left: 8px;" class="btn btn-primary">Run remotely</button></div> -->
|
||||
@ -290,14 +323,14 @@
|
||||
</html>
|
||||
|
||||
<style>
|
||||
button{
|
||||
button {
|
||||
margin-top: 5px;
|
||||
}
|
||||
</style>
|
||||
<script src="global.js"></script>
|
||||
<script>
|
||||
var sId = getUrlParam('id');
|
||||
var app = new Vue({
|
||||
let sId = getUrlParam('id');
|
||||
let app = new Vue({
|
||||
el: '#taskInfo',
|
||||
data: {
|
||||
task: {},
|
||||
@ -306,8 +339,8 @@
|
||||
lang: getUrlParam("lang"),
|
||||
type: getUrlParam("type"),
|
||||
tip: "The parameter values in the Excel file have been successfully imported into the corresponding field text box~Excel文件中的参数值已成功导入到对应字段文本框中",
|
||||
file:null,
|
||||
user_data_folder:"",
|
||||
file: null,
|
||||
user_data_folder: "",
|
||||
fileUploadStatus: "Status: Waiting for upload~状态:等待上传",
|
||||
with_user_data: true,
|
||||
backEndAddressServiceWrapper: getUrlParam("backEndAddressServiceWrapper"),
|
||||
@ -315,18 +348,18 @@
|
||||
config_folder: "",
|
||||
easyspider_location: "",
|
||||
mysql_config_path: "",
|
||||
OS: "win32",
|
||||
OS: "Windows",
|
||||
}, mounted() {
|
||||
$.get(this.backEndAddressServiceWrapper + "/getConfig", function (result) {
|
||||
app.$data.user_data_folder = result.user_data_folder;
|
||||
try{
|
||||
app.$data.mysql_config_path = result.mysql_config_path;
|
||||
} catch (e) {
|
||||
app.$data.mysql_config_path = "./mysql_config.json";
|
||||
}
|
||||
});
|
||||
//TODO 翻译 写清楚readme有关浏览器版本的问题 logo更换 提示看文档来运行
|
||||
},
|
||||
$.get(this.backEndAddressServiceWrapper + "/getConfig", function (result) {
|
||||
app.$data.user_data_folder = result.user_data_folder;
|
||||
try {
|
||||
app.$data.mysql_config_path = result.mysql_config_path;
|
||||
} catch (e) {
|
||||
app.$data.mysql_config_path = "./mysql_config.json";
|
||||
}
|
||||
});
|
||||
//TODO 翻译 写清楚readme有关浏览器版本的问题 logo更换 提示看文档来运行
|
||||
},
|
||||
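Note on the `mounted()` hook above: reading a missing property from `result` yields `undefined` rather than throwing, so the `catch` branch that sets "./mysql_config.json" would probably not run when `mysql_config_path` is absent from the `/getConfig` response. A minimal sketch of an explicit fallback follows, assuming the response is a plain object. `resolveMysqlConfigPath` is a hypothetical helper, not part of the repository.

```javascript
// Hypothetical helper (an assumption, not EasySpider code): return the
// configured path, or the default used in the catch branch above.
function resolveMysqlConfigPath(result) {
  const DEFAULT_PATH = "./mysql_config.json";
  // result may be null or undefined, and the field may be missing or empty.
  const path = result ? result.mysql_config_path : undefined;
  return typeof path === "string" && path.length > 0 ? path : DEFAULT_PATH;
}

console.log(resolveMysqlConfigPath({ mysql_config_path: "/data/mysql.json" })); // prints /data/mysql.json
console.log(resolveMysqlConfigPath({})); // prints ./mysql_config.json
```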
methods: {
|
||||
LANG: function (zh, en) {
|
||||
if (this.lang == "zh") {
|
||||
@ -338,40 +371,40 @@
|
||||
gotoHome: function () {
|
||||
let url = "";
|
||||
if (getUrlParam("lang") == "zh") {
|
||||
url = "taskList.html?lang=zh&type="+getUrlParam("type")+"&wsport="+getUrlParam("wsport")+"&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper
|
||||
url = "taskList.html?lang=zh&type=" + getUrlParam("type") + "&wsport=" + getUrlParam("wsport") + "&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper
|
||||
} else {
|
||||
url = "taskList.html?lang=en&type="+getUrlParam("type")+"&wsport="+getUrlParam("wsport")+"&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper
|
||||
url = "taskList.html?lang=en&type=" + getUrlParam("type") + "&wsport=" + getUrlParam("wsport") + "&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper
|
||||
}
|
||||
window.location.href = url;
|
||||
}, gotoInfo: function () {
|
||||
let url = "";
|
||||
if (getUrlParam("lang") == "zh") {
|
||||
url = "taskInfo.html?id="+getUrlParam("id")+"&lang=zh&type="+getUrlParam("type")+"&wsport="+getUrlParam("wsport")+"&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper
|
||||
url = "taskInfo.html?id=" + getUrlParam("id") + "&lang=zh&type=" + getUrlParam("type") + "&wsport=" + getUrlParam("wsport") + "&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper
|
||||
} else {
|
||||
url = "taskInfo.html?id="+getUrlParam("id")+"&lang=en&type="+getUrlParam("type")+"&wsport="+getUrlParam("wsport")+"&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper
|
||||
url = "taskInfo.html?id=" + getUrlParam("id") + "&lang=en&type=" + getUrlParam("type") + "&wsport=" + getUrlParam("wsport") + "&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper
|
||||
}
|
||||
window.location.href = url;
|
||||
}, getLine: function(i, index){
|
||||
const value = this.task.inputParameters[i-1]["value"].toString();
|
||||
}, getLine: function (i, index) {
|
||||
const value = this.task.inputParameters[i - 1]["value"].toString();
|
||||
const parts = value.split("\n");
|
||||
if(parts.length > index){
|
||||
if (parts.length > index) {
|
||||
return parts[index];
|
||||
} else if(this.task.inputParameters[i-1]["name"].indexOf("url") >=0){
|
||||
} else if (this.task.inputParameters[i - 1]["name"].indexOf("url") >= 0) {
|
||||
return parts[0];
|
||||
} else {
|
||||
return "";
|
||||
}
|
||||
},
|
||||
readFromExcel: function(){
|
||||
readFromExcel: function () {
|
||||
$('#excelModal').modal('show');
|
||||
}, submitFile: function() {
|
||||
}, submitFile: function () {
|
||||
let form_data = new FormData();
|
||||
this.file = $('#excelFile').prop('files')[0];
|
||||
if(this.file == null || $('#excelFile').val() == ""){
|
||||
if (this.file == null || $('#excelFile').val() == "") {
|
||||
this.fileUploadStatus = "Status: Please select a file~状态:请选择文件";
|
||||
return;
|
||||
}
|
||||
if (this.file.name.split('.').pop() !== 'xlsx' ) {
|
||||
if (this.file.name.split('.').pop() !== 'xlsx') {
|
||||
this.fileUploadStatus = "Status: Only xlsx files are allowed!~状态:只允许上传xlsx文件!";
|
||||
return;
|
||||
}
|
||||
@@ -379,25 +412,25 @@
|
||||
form_data.append('file', $('#excelFile').prop('files')[0]);
|
||||
// console.log(app.$data.backEndAddressServiceWrapper + "/excelUpload",)
|
||||
$.ajax({
|
||||
url: app.$data.backEndAddressServiceWrapper.replace("8074","8075") + "/excelUpload",
|
||||
url: "http://localhost:8075/excelUpload",
|
||||
type: 'POST',
|
||||
data: form_data,
|
||||
processData: false,
|
||||
contentType: false,
|
||||
success: function(response) {
|
||||
success: function (response) {
|
||||
response = JSON.parse(response);
|
||||
$('#excelModal').modal('hide');
|
||||
app.$data.fileUploadStatus = "Status: Upload successfully~状态:上传成功";
|
||||
$('#excelFile').val("");
|
||||
let inputParameters = app.$data.task.inputParameters;
|
||||
inputParameters.forEach(function (item, index) {
|
||||
if(Object.keys(response).includes(item.name)){
|
||||
if (Object.keys(response).includes(item.name)) {
|
||||
let temp = "";
|
||||
let tempArray = [];
|
||||
for (let i = 0; i < response[item.name].length; i++) {
|
||||
for(let key of Object.keys(response)){
|
||||
if(key.includes(item.name)){
|
||||
temp += response[key][i] == undefined? "": response[key][i] + "~";
|
||||
for (let key of Object.keys(response)) {
|
||||
if (key.includes(item.name)) {
|
||||
temp += response[key][i] == undefined ? "" : response[key][i] + "~";
|
||||
}
|
||||
}
|
||||
temp = temp.substring(0, temp.length - 1); // drop the trailing ~
|
||||
@@ -408,11 +441,11 @@
|
||||
}
|
||||
});
|
||||
$("#tipCustom").slideDown(); // tip box
|
||||
setTimeout(function() {
|
||||
setTimeout(function () {
|
||||
$("#tipCustom").slideUp();
|
||||
}, 3000);
|
||||
},
|
||||
error: function(err) {
|
||||
error: function (err) {
|
||||
app.$data.fileUploadStatus = "Status: Upload failed~状态:上传失败";
|
||||
}
|
||||
});
|
||||
@@ -436,17 +469,17 @@
|
||||
params: JSON.stringify(param)
|
||||
}
|
||||
$.post(app.$data.backEndAddressServiceWrapper + "/invokeTask", message, function (result) {
|
||||
if(app.$data.ID == result){
|
||||
if (app.$data.ID == result) {
|
||||
if (getUrlParam("lang") == "en" || getUrlParam("lang") == "") {
|
||||
$("#tipID_EN").slideDown(); // tip box
|
||||
$("#newID_EN").text(result);
|
||||
setTimeout(function() {
|
||||
setTimeout(function () {
|
||||
$("#tipID_EN").slideUp();
|
||||
}, 5000);
|
||||
} else {
|
||||
$("#tipID").slideDown(); // tip box
|
||||
$("#newID_ZH").text(result);
|
||||
setTimeout(function() {
|
||||
setTimeout(function () {
|
||||
$("#tipID").slideUp();
|
||||
}, 5000);
|
||||
}
|
||||
@@ -455,16 +488,16 @@
|
||||
});
|
||||
// }
|
||||
},
|
||||
localExecute: function (with_user_data=false) {
|
||||
localExecute: function (with_user_data = false) {
|
||||
if (this.ID === "") {
|
||||
if (getUrlParam("lang") == "en" || getUrlParam("lang") == "") {
|
||||
$("#tipEN").slideDown(); // tip box
|
||||
setTimeout(function() {
|
||||
setTimeout(function () {
|
||||
$("#tipEN").slideUp();
|
||||
}, 3000);
|
||||
} else {
|
||||
$("#tip").slideDown(); // tip box
|
||||
setTimeout(function() {
|
||||
setTimeout(function () {
|
||||
$("#tip").slideUp();
|
||||
}, 3000);
|
||||
}
|
||||
@@ -478,24 +511,24 @@
|
||||
// text = "确定要在本地运行此任务吗?";
|
||||
// }
|
||||
// if (confirm(text)) {
|
||||
let message = { // show the flowchart
|
||||
type: 5, // message type: invoke the execution program
|
||||
message: {
|
||||
"id": app.$data.ID,
|
||||
"user_data_folder": app.$data.with_user_data ? app.$data.user_data_folder : "",
|
||||
"mysql_config_path": app.$data.mysql_config_path,
|
||||
"execute_type": 0,
|
||||
}
|
||||
};
|
||||
ws.send(JSON.stringify(message));
|
||||
changeCommand();
|
||||
let message = { // show the flowchart
|
||||
type: 5, // message type: invoke the execution program
|
||||
message: {
|
||||
"id": app.$data.ID,
|
||||
"user_data_folder": app.$data.with_user_data ? app.$data.user_data_folder : "",
|
||||
"mysql_config_path": app.$data.mysql_config_path,
|
||||
"execute_type": 0,
|
||||
}
|
||||
};
|
||||
ws.send(JSON.stringify(message));
|
||||
changeCommand();
|
||||
$('#myModal').modal('show');
|
||||
// }
|
||||
},
|
||||
remoteExecute: function () {
|
||||
|
||||
},
|
||||
localExecuteInstant: function (with_user_data=false) {
|
||||
localExecuteInstant: function (with_user_data = false) {
|
||||
let text = "";
|
||||
// if (getUrlParam("lang") == "en" || getUrlParam("lang") == "") {
|
||||
// text = "Are you sure to run this task locally now?";
|
||||
@@ -505,34 +538,36 @@
|
||||
|
||||
this.with_user_data = with_user_data;
|
||||
// if (confirm(text)) {
|
||||
let param = {};
|
||||
let t = $('#form').serializeArray();
|
||||
t.forEach(function (item, index) {
|
||||
param[item.name] = item.value;
|
||||
});
|
||||
$.post(app.$data.backEndAddressServiceWrapper + "/invokeTask", {
|
||||
id: this.task.id,
|
||||
EID: this.ID,
|
||||
params: JSON.stringify(param)
|
||||
}, function (result) {
|
||||
let message = { // show the flowchart
|
||||
type: 5, // message type: invoke the execution program
|
||||
message: {
|
||||
"id": result,
|
||||
"user_data_folder": app.$data.with_user_data ? app.$data.user_data_folder : "",
|
||||
"mysql_config_path": app.$data.mysql_config_path,
|
||||
"execute_type": 1,
|
||||
}
|
||||
};
|
||||
app.$data.ID = result;
|
||||
ws.send(JSON.stringify(message));
|
||||
$.get(app.$data.backEndAddressServiceWrapper + "/queryOSVersion", function (OSInfo) {
|
||||
if(OSInfo.version == 'darwin'){
|
||||
changeCommand();
|
||||
$('#myModal').modal('show');
|
||||
}
|
||||
});
|
||||
});
|
||||
let param = {};
|
||||
let t = $('#form').serializeArray();
|
||||
t.forEach(function (item, index) {
|
||||
param[item.name] = item.value;
|
||||
});
|
||||
$.post(app.$data.backEndAddressServiceWrapper + "/invokeTask", {
|
||||
id: this.task.id,
|
||||
EID: this.ID,
|
||||
params: JSON.stringify(param)
|
||||
}, function (result) {
|
||||
let message = { // show the flowchart
|
||||
type: 5, // message type: invoke the execution program
|
||||
message: {
|
||||
"id": result,
|
||||
"user_data_folder": app.$data.with_user_data ? app.$data.user_data_folder : "",
|
||||
"mysql_config_path": app.$data.mysql_config_path,
|
||||
"execute_type": 1,
|
||||
}
|
||||
};
|
||||
app.$data.ID = result;
|
||||
ws.send(JSON.stringify(message));
|
||||
// Call the detection function and use its result
|
||||
const systemInfo = detectOperatingSystemAndArch();
|
||||
// $.get(app.$data.backEndAddressServiceWrapper + "/queryOSVersion", function (OSInfo) {
|
||||
if (systemInfo.OS == 'MacOS') {
|
||||
changeCommand();
|
||||
$('#myModal').modal('show');
|
||||
}
|
||||
// });
|
||||
});
|
||||
// }
|
||||
},
|
||||
remoteExecuteInstant: function () {
|
||||
@@ -541,23 +576,25 @@
|
||||
});
|
||||
|
||||
function changeCommand() {
|
||||
$.get(app.$data.backEndAddressServiceWrapper + "/queryOSVersion", function (OSInfo) {
|
||||
app.$data.OS = OSInfo.version;
|
||||
if(OSInfo.version == 'win32' && OSInfo.bit == 'x64'){
|
||||
// $.get(app.$data.backEndAddressServiceWrapper + "/queryOSVersion", function (OSInfo) {
|
||||
// app.$data.OS = systemInfo.OS;
|
||||
const systemInfo = detectOperatingSystemAndArch();
|
||||
app.$data.OS = systemInfo.OS;
|
||||
if (systemInfo.OS == 'Windows' && systemInfo.architecture == 'x64') {
|
||||
app.$data.command = "./EasySpider/resources/app/chrome_win64/easyspider_executestage.exe --ids [" + app.$data.ID.toString() + "] --user_data " + (app.$data.with_user_data ? "1" : "0") + " --server_address " + app.$data.backEndAddressServiceWrapper;
|
||||
} else if(OSInfo.version == 'win32' && OSInfo.bit == 'ia32'){
|
||||
} else if (systemInfo.OS == 'Windows' && systemInfo.architecture == 'ia32') {
|
||||
app.$data.command = "./EasySpider/resources/app/chrome_win32/easyspider_executestage.exe --ids [" + app.$data.ID.toString() + "] --user_data " + (app.$data.with_user_data ? "1" : "0") + " --server_address " + app.$data.backEndAddressServiceWrapper;
|
||||
} else if(OSInfo.version == 'linux'){
|
||||
} else if (systemInfo.OS == 'Linux') {
|
||||
app.$data.command = "./EasySpider/resources/app/chrome_linux64/easyspider_executestage --ids '[" + app.$data.ID.toString() + "]' --user_data " + (app.$data.with_user_data ? "1" : "0") + " --server_address " + app.$data.backEndAddressServiceWrapper;
|
||||
} else if(OSInfo.version == 'darwin'){
|
||||
if(getUrlParam("lang") == "zh"){
|
||||
app.$data.easyspider_location = "你的EasySpider文件夹,如:cd /Users/"+ app.$data.config_folder.split("/")[2] + "/Downloads/EasySpider_MacOS";
|
||||
} else if (systemInfo.OS == 'MacOS') {
|
||||
if (getUrlParam("lang") == "zh") {
|
||||
app.$data.easyspider_location = "你的EasySpider文件夹,如:cd /Users/" + app.$data.config_folder.split("/")[2] + "/Downloads/EasySpider_MacOS";
|
||||
} else {
|
||||
app.$data.easyspider_location = "Your EasySpider folder, such as: cd /Users/"+ app.$data.config_folder.split("/")[2] + "/Downloads/EasySpider_MacOS";
|
||||
app.$data.easyspider_location = "Your EasySpider folder, such as: cd /Users/" + app.$data.config_folder.split("/")[2] + "/Downloads/EasySpider_MacOS";
|
||||
}
|
||||
app.$data.command = "./easyspider_executestage --ids '[" + app.$data.ID.toString() + "]' --user_data " + (app.$data.with_user_data ? "1" : "0") + " --server_address " + app.$data.backEndAddressServiceWrapper;
|
||||
}
|
||||
});
|
||||
// });
|
||||
}
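The per-platform command selection in `changeCommand()` above can be sketched as a pure function. This is only an illustration: the paths and flags are copied from the diff, while `buildCommand` and its parameter names are hypothetical, and returning an empty string for unrecognized systems is an assumption (the original leaves `command` unchanged in that case).

```javascript
// Hypothetical helper mirroring the branches of changeCommand().
// systemInfo has the shape returned by detectOperatingSystemAndArch().
function buildCommand(systemInfo, id, withUserData, serverAddress) {
    // Shared suffix: user-data flag and back-end address
    const tail = " --user_data " + (withUserData ? "1" : "0") +
        " --server_address " + serverAddress;
    if (systemInfo.OS == 'Windows' && systemInfo.architecture == 'x64') {
        return "./EasySpider/resources/app/chrome_win64/easyspider_executestage.exe --ids [" + id + "]" + tail;
    } else if (systemInfo.OS == 'Windows' && systemInfo.architecture == 'ia32') {
        return "./EasySpider/resources/app/chrome_win32/easyspider_executestage.exe --ids [" + id + "]" + tail;
    } else if (systemInfo.OS == 'Linux') {
        // Linux and MacOS quote the id list for the shell
        return "./EasySpider/resources/app/chrome_linux64/easyspider_executestage --ids '[" + id + "]'" + tail;
    } else if (systemInfo.OS == 'MacOS') {
        return "./easyspider_executestage --ids '[" + id + "]'" + tail;
    }
    return ""; // assumption: no command for unknown systems
}

console.log(buildCommand({ OS: 'Linux', architecture: 'x64' }, 7, false, "http://localhost:8074"));
```

Keeping the string building free of `app.$data` makes each branch easy to check without a running Vue instance.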
|
||||
|
||||
$.get(app.$data.backEndAddressServiceWrapper + "/queryTask?id=" + sId, function (result) {
|
||||
@@ -565,7 +602,7 @@
|
||||
app.$data.show = true;
|
||||
});
|
||||
|
||||
ws = new WebSocket("ws://localhost:"+getUrlParam("wsport"));
|
||||
ws = new WebSocket("ws://localhost:" + getUrlParam("wsport"));
|
||||
ws.onopen = function () {
|
||||
// WebSocket connected; use send() to send data
|
||||
console.log("Connected");
|
||||
@@ -587,10 +624,10 @@
|
||||
};
|
||||
this.send(JSON.stringify(message));
|
||||
};
|
||||
ws.onmessage = function(message){
|
||||
ws.onmessage = function (message) {
|
||||
message = JSON.parse(message.data);
|
||||
app.$data.config_folder = message.config_folder.replaceAll("\\","/");
|
||||
app.$data.easyspider_location = message.easyspider_location.replace("/EasySpider.app/","");
|
||||
app.$data.config_folder = message.config_folder.replaceAll("\\", "/");
|
||||
app.$data.easyspider_location = message.easyspider_location.replace("/EasySpider.app/", "");
|
||||
}
|
||||
ws.onclose = function () {
|
||||
// Close the websocket
|
||||
|
@@ -23,14 +23,14 @@ function DateFormat(datetime) {
|
||||
|
||||
function formatDateTime(date) {
|
||||
const addZero = (num) => (num < 10 ? `0${num}` : num);
|
||||
|
||||
|
||||
let year = date.getFullYear();
|
||||
let month = addZero(date.getMonth() + 1); // getMonth() returns 0-11, so add 1
|
||||
let day = addZero(date.getDate());
|
||||
let hours = addZero(date.getHours());
|
||||
let minutes = addZero(date.getMinutes());
|
||||
let seconds = addZero(date.getSeconds());
|
||||
|
||||
|
||||
return `${year}-${month}-${day} ${hours}:${minutes}:${seconds}`;
|
||||
}
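As a quick check of the padding behaviour, the helper above can be run on a fixed date. The function body is repeated from the diff so the snippet is self-contained; the sample date is an arbitrary choice for illustration.

```javascript
// Copied from the diff: formats a Date as "YYYY-MM-DD HH:mm:ss" in local time.
function formatDateTime(date) {
    const addZero = (num) => (num < 10 ? `0${num}` : num);
    let year = date.getFullYear();
    let month = addZero(date.getMonth() + 1); // getMonth() returns 0-11, so add 1
    let day = addZero(date.getDate());
    let hours = addZero(date.getHours());
    let minutes = addZero(date.getMinutes());
    let seconds = addZero(date.getSeconds());
    return `${year}-${month}-${day} ${hours}:${minutes}:${seconds}`;
}

// Local-time constructor: 5 January 2024, 09:03:07 (month index 0 is January)
const sample = new Date(2024, 0, 5, 9, 3, 7);
console.log(formatDateTime(sample)); // 2024-01-05 09:03:07
```

Because both the constructor and the getters use local time, the result does not depend on the machine's time zone.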
|
||||
|
||||
@@ -57,7 +57,7 @@ function detectLang(str) {
|
||||
|
||||
if (enCount === cnCount) {
|
||||
return 2;
|
||||
} else if (cnCount>=3) {
|
||||
} else if (cnCount >= 3) {
|
||||
return 1;
|
||||
}
|
||||
return 0;
|
||||
@@ -82,15 +82,23 @@ Vue.filter('lang', function (value) {
|
||||
}
|
||||
})
|
||||
|
||||
function LANG(zh, en) {
|
||||
if (window.location.href.indexOf("_CN") != -1) {
|
||||
return zh;
|
||||
} else {
|
||||
return en;
|
||||
}
|
||||
}
|
||||
|
||||
function isValidMySQLTableName(tableName) {
|
||||
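// Regex: starts with a letter or Chinese character, followed by letters, digits, underscores, or Chinese characters; total length 1 to 64 characters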
|
||||
const pattern = /^[\u4e00-\u9fa5a-zA-Z][\u4e00-\u9fa5a-zA-Z0-9_]{0,63}$/;
|
||||
return pattern.test(tableName);
|
||||
}
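A few sample inputs make the rule in `isValidMySQLTableName` concrete. The function is repeated from the diff so the snippet runs on its own; the table names are made up for illustration.

```javascript
// Copied from the diff: first character is a letter or CJK character
// (U+4E00 to U+9FA5), followed by up to 63 letters, digits, underscores
// or CJK characters.
function isValidMySQLTableName(tableName) {
    const pattern = /^[\u4e00-\u9fa5a-zA-Z][\u4e00-\u9fa5a-zA-Z0-9_]{0,63}$/;
    return pattern.test(tableName);
}

console.log(isValidMySQLTableName("users"));        // true
console.log(isValidMySQLTableName("表_1"));          // true
console.log(isValidMySQLTableName("1table"));       // false: starts with a digit
console.log(isValidMySQLTableName("my-table"));     // false: hyphen not allowed
console.log(isValidMySQLTableName("a".repeat(65))); // false: longer than 64
```

Note that this check is stricter than what MySQL itself accepts for quoted identifiers; it is a conservative filter rather than a full identifier grammar.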
|
||||
|
||||
document.onkeydown = function(e) {
|
||||
document.onkeydown = function (e) {
|
||||
let t = false;
|
||||
try{
|
||||
try {
|
||||
t = nowNode;
|
||||
} catch (e) {
|
||||
console.log(e);
|
||||
@@ -109,8 +117,8 @@ document.onkeydown = function(e) {
|
||||
location.reload();
|
||||
} else if (currKey == 123) {
|
||||
console.log("打开devtools")
|
||||
let command = new WebSocket("ws://localhost:"+getUrlParam("wsport"))
|
||||
command.onopen = function() {
|
||||
let command = new WebSocket("ws://localhost:" + getUrlParam("wsport"))
|
||||
command.onopen = function () {
|
||||
let message = {
|
||||
type: 6, // message type; 0 means a connect operation
|
||||
};
|
||||
@@ -119,3 +127,27 @@ document.onkeydown = function(e) {
|
||||
}
|
||||
}
|
||||
}
|
||||
function detectOperatingSystemAndArch() {
|
||||
const platform = navigator.platform.toLowerCase();
|
||||
const userAgent = navigator.userAgent.toLowerCase();
|
||||
let OS = 'Unknown';
|
||||
let architecture = 'Unknown';
|
||||
|
||||
// Determine the operating system type
|
||||
if (platform.includes('win')) {
|
||||
OS = 'Windows';
|
||||
} else if (platform.includes('mac')) {
|
||||
OS = 'MacOS';
|
||||
} else if (platform.includes('linux')) {
|
||||
OS = 'Linux';
|
||||
}
|
||||
|
||||
// Determine the operating system bitness (architecture)
|
||||
if (userAgent.includes('wow64') || userAgent.includes('win64') || platform.includes('x86_64') || platform.includes('amd64')) {
|
||||
architecture = 'x64';
|
||||
} else {
|
||||
architecture = 'ia32';
|
||||
}
|
||||
|
||||
return { OS, architecture };
|
||||
}
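To see how the new detection behaves on typical inputs, the same checks can be written as a pure function. This is a sketch under assumptions: `detectFromStrings` is a hypothetical name, and the platform and user-agent strings are passed in as arguments instead of being read from `navigator`, so the logic runs outside a browser. The sample strings are representative values, not an exhaustive list.

```javascript
// Hypothetical pure variant of detectOperatingSystemAndArch();
// the substring checks are copied from the diff.
function detectFromStrings(platformRaw, userAgentRaw) {
    const platform = platformRaw.toLowerCase();
    const userAgent = userAgentRaw.toLowerCase();

    let OS = 'Unknown';
    if (platform.includes('win')) OS = 'Windows';
    else if (platform.includes('mac')) OS = 'MacOS';
    else if (platform.includes('linux')) OS = 'Linux';

    // Anything without one of these markers is reported as ia32
    const is64 = userAgent.includes('wow64') || userAgent.includes('win64') ||
        platform.includes('x86_64') || platform.includes('amd64');
    return { OS, architecture: is64 ? 'x64' : 'ia32' };
}

console.log(detectFromStrings("Win32", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"));
console.log(detectFromStrings("Linux x86_64", "Mozilla/5.0 (X11; Linux x86_64)"));
console.log(detectFromStrings("MacIntel", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"));
```

With these inputs the MacOS case falls through to `ia32`, since none of the 64-bit markers match. That does not affect the command shown in this diff, because the MacOS branch of `changeCommand()` does not look at `architecture`.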
|
||||
|
@@ -83,6 +83,18 @@ function changeGetDataParameters(msg, i) {
|
||||
msg["parameters"][i]["afterJSWaitTime"] = 0; // wait time after running the JS
|
||||
msg["parameters"][i]["downloadPic"] = 0; // whether to download images
|
||||
msg["parameters"][i]["splitLine"] = 0; // whether to split lines
|
||||
try {
|
||||
let exampleValue = msg["parameters"][i]["exampleValues"][0]["value"];
|
||||
// compute the length of the text with whitespace removed
|
||||
let len = exampleValue.replace(/\s+/g, "").length;
|
||||
// for text-type fields, split lines by default when the length exceeds 200
|
||||
if (len > 200 && msg["parameters"][i]["nodeType"] == 0 && msg["parameters"][i]["contentType"] == 0) {
|
||||
msg["parameters"][i]["splitLine"] = 1; // example value longer than 200: split lines by default
|
||||
showInfo(LANG("单个字段示例值长度超过200,已自动开启换行功能。", "The length of the example value of a single field exceeds 200, and the line break function has been automatically turned on."), 4000);
|
||||
}
|
||||
} catch (e) {
|
||||
console.log(e);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -188,20 +200,34 @@ function notifyParameterNum(num) {
|
||||
ws.send(JSON.stringify(message));
|
||||
}
|
||||
|
||||
function trailElement(node, type = 1) {
|
||||
// type=0 means mark the node; type=1 means trial run
|
||||
let parentNode = nodeList[actionSequence[node["parentId"]]];
|
||||
if (node.option == 10) { // for a conditional branch, pass the parent's parent
|
||||
function updateParentNode() {
|
||||
// console.log("updateParentNode")
|
||||
let parentNode = nodeList[actionSequence[app._data.nowNode["parentId"]]];
|
||||
if (app._data.nowNode.option == 10) { // for a conditional branch, pass the parent's parent
|
||||
parentNode = nodeList[actionSequence[parentNode["parentId"]]];
|
||||
}
|
||||
if (parentNode.option == 10) { // if the parent is a conditional branch, pass the parent's grandparent
|
||||
parentNode = nodeList[actionSequence[parentNode["parentId"]]];
|
||||
parentNode = nodeList[actionSequence[parentNode["parentId"]]];
|
||||
}
|
||||
app._data.parentNode = parentNode;
|
||||
}
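The parent lookup that `updateParentNode()` factors out can be illustrated with a standalone function and a tiny tree. This is a sketch, not the project's code: `resolveParentNode` is a hypothetical name, the node is passed in rather than read from `app._data.nowNode`, and the sample data (including every `option` value other than 10) is invented for illustration.

```javascript
// Hypothetical standalone version of the lookup in updateParentNode().
// actionSequence maps a node id to its index in nodeList, as in the diff.
function resolveParentNode(node, nodeList, actionSequence) {
    let parentNode = nodeList[actionSequence[node.parentId]];
    if (node.option == 10) { // conditional branch: use the parent's parent
        parentNode = nodeList[actionSequence[parentNode.parentId]];
    }
    if (parentNode.option == 10) { // parent is a branch: climb two more levels
        parentNode = nodeList[actionSequence[parentNode.parentId]];
        parentNode = nodeList[actionSequence[parentNode.parentId]];
    }
    return parentNode;
}

// Invented sample: root > loop > judge > branch > action
const nodeList = [
    { id: 0, parentId: 0, option: -1 }, // root (option value assumed)
    { id: 1, parentId: 0, option: 8 },  // loop (option value assumed)
    { id: 2, parentId: 1, option: 9 },  // judge (option value assumed)
    { id: 3, parentId: 2, option: 10 }, // conditional branch
    { id: 4, parentId: 3, option: 2 },  // action inside the branch (option value assumed)
];
const actionSequence = [0, 1, 2, 3, 4];

console.log(resolveParentNode(nodeList[4], nodeList, actionSequence).id); // 1
console.log(resolveParentNode(nodeList[3], nodeList, actionSequence).id); // 1
```

In both cases the branch and its judge node are skipped, so the node that ends up in the message is the enclosing loop.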
|
||||
|
||||
function trailElement(node, type = 1) {
|
||||
// type=0 means mark the node; type=1 means trial run
|
||||
// let parentNode = nodeList[actionSequence[node["parentId"]]];
|
||||
// if (node.option == 10) { // for a conditional branch, pass the parent's parent
|
||||
// parentNode = nodeList[actionSequence[parentNode["parentId"]]];
|
||||
// }
|
||||
// if (parentNode.option == 10) { // if the parent is a conditional branch, pass the parent's grandparent
|
||||
// parentNode = nodeList[actionSequence[parentNode["parentId"]]];
|
||||
// parentNode = nodeList[actionSequence[parentNode["parentId"]]];
|
||||
// }
|
||||
updateParentNode();
|
||||
let message = {
|
||||
type: 4, // message type; 4 means a trial-run event
|
||||
from: 1, // 0 means browser to flowchart; 1 means flowchart to browser
|
||||
message: {"type": type, "node": JSON.stringify(node), "parentNode": JSON.stringify(parentNode)}
|
||||
message: {"type": type, "node": JSON.stringify(node), "parentNode": JSON.stringify(app._data.parentNode)}
|
||||
};
|
||||
ws.send(JSON.stringify(message));
|
||||
console.log(node);
|
||||
@@ -214,6 +240,7 @@ function handleElement() {
|
||||
app._data["nowNode"] = nodeList[vueData.nowNodeIndex];
|
||||
app._data["nodeType"] = app._data["nowNode"]["option"];
|
||||
app._data.useLoop = app._data["nowNode"]["parameters"]["useLoop"];
|
||||
app._data.xpath = app._data["nowNode"]["parameters"]["xpath"];
|
||||
app._data["codeMode"] = -1; // custom initialization
|
||||
if (app._data["nodeType"] == 8) {
|
||||
app._data.loopType = app._data["nowNode"]["parameters"]["loopType"];
|
||||
@@ -267,6 +294,7 @@ function addParameters(t) {
|
||||
t["parameters"]["afterJS"] = ""; // JS to run after execution
|
||||
t["parameters"]["afterJSWaitTime"] = 0; // wait time after running the JS
|
||||
t["parameters"]["alertHandleType"] = 0; // alert handling type: 1 means confirm, 2 means cancel
|
||||
t["parameters"]["downloadWaitTime"] = 3600; // download wait time
|
||||
} else if (t.option == 3) { // extract data
|
||||
t["parameters"]["clear"] = 0; // clear data of other fields
|
||||
t["parameters"]["newLine"] = 1; // generate a new row
|
||||
@@ -359,7 +387,7 @@ function modifyParameters(t, param) {
|
||||
t["parameters"]["xpath"] = param["xpath"];
|
||||
t["parameters"]["useLoop"] = param["useLoop"];
|
||||
t["parameters"]["allXPaths"] = param["allXPaths"];
|
||||
if(param["type"] == "loopClickEvery"){
|
||||
if (param["type"] == "loopClickEvery") {
|
||||
t["parameters"]["newTab"] = 1; // loop-click every element: open in a new tab
|
||||
}
|
||||
} else if (t.option == 4) { // text input event
|
||||
@@ -389,7 +417,7 @@ function modifyParameters(t, param) {
|
||||
if (content.length > 15) {
|
||||
content = content.substring(0, 15) + "...";
|
||||
content = LANG(":", ": ") + content;
|
||||
} else if(content.length == 0){
|
||||
} else if (content.length == 0) {
|
||||
content = LANG("单个元素", " Single Element");
|
||||
} else {
|
||||
content = LANG(":", ": ") + content;
|
||||
@@ -418,7 +446,7 @@ function modifyParameters(t, param) {
|
||||
}
|
||||
}
|
||||
|
||||
function showSuccess(msg, time = 4000) {
|
||||
function showSuccess(msg, time = 1000) {
|
||||
$("#tip").text(msg);
|
||||
$("#tip").slideDown(); // tip box
|
||||
let fadeout = setTimeout(function () {
|
||||
@@ -463,7 +491,7 @@ if (mobile == "true") {
|
||||
}
|
||||
|
||||
let serviceInfo = {
|
||||
"version": "0.6.0"
|
||||
"version": "0.6.3"
|
||||
};
|
||||
|
||||
function saveService(type) {
|
||||
@@ -597,7 +625,7 @@ function saveService(type) {
|
||||
"links": links,
|
||||
"create_time": $("#create_time").val(),
|
||||
"update_time": formatDateTime(new Date()),
|
||||
"version": "0.6.0",
|
||||
"version": "0.6.3",
|
||||
"saveThreshold": saveThreshold,
|
||||
// "cloudflare": cloudflare,
|
||||
"quitWaitTime": parseInt($("#quitWaitTime").val()),
|
||||
@@ -652,8 +680,8 @@ if (sId != null && sId != -1) // load task
|
||||
if (!("cookies" in node["parameters"])) {
|
||||
node["parameters"]["cookies"] = "";
|
||||
}
|
||||
} else if(node["option"] == 3){ // extract data
|
||||
if(node["parameters"]["paras"] != undefined && node["parameters"]["params"] == undefined){
|
||||
} else if (node["option"] == 3) { // extract data
|
||||
if (node["parameters"]["paras"] != undefined && node["parameters"]["params"] == undefined) {
|
||||
node["parameters"]["params"] = node["parameters"]["paras"];
|
||||
}
|
||||
}
|
||||
@@ -681,10 +709,4 @@ if (sId != null && sId != -1) // load task
|
||||
refresh(); // new task
|
||||
}
|
||||
|
||||
function LANG(zh, en) {
|
||||
if (window.location.href.indexOf("_CN") != -1) {
|
||||
return zh;
|
||||
} else {
|
||||
return en;
|
||||
}
|
||||
}
|
||||
|
||||
|
@@ -23,7 +23,7 @@
|
||||
|
||||
<body>
|
||||
<div class="row" style="margin-top: 40px" id="newTask">
|
||||
<div class="col-md-6" style="margin:0 auto;" style="text-align: center;">
|
||||
<div class="col-md-8" style="margin:0 auto;" style="text-align: center;">
|
||||
<nav aria-label="breadcrumb">
|
||||
<ol class="breadcrumb" style="padding-left:0;background-color: white">
|
||||
<li class="breadcrumb-item" @click="gotoHome"><a href="#">{{"Home~首页" | lang}}</a></li>
|
||||
@@ -33,7 +33,7 @@
|
||||
<h4 style="text-align: center;">{{"New Task~新任务" | lang}}</h4>
|
||||
<div class="form-group">
|
||||
<label>{{"Please Input URL (http or https):~请输入网页网址(以http或https开头):" | lang}} </label>
|
||||
<textarea class="form-control" id="links" placeholder="links" style="min-height: 100px;">{{"https://www.ebay.com~https://www.jd.com" | lang}}</textarea>
|
||||
<textarea class="form-control" id="links" placeholder="links" style="min-height: 100px;">{{"https://www.ebay.com~https://www.baidu.com" | lang}}</textarea>
|
||||
</div>
|
||||
<button type="submit" id="send" class="btn btn-primary">{{"Start Design~开始设计" | lang}}</button>
|
||||
<!-- <div class="form-group" style="margin-top: 10px">-->
|
||||
|
@@ -4,7 +4,8 @@
|
||||
<head>
|
||||
<script src="jquery-3.4.1.min.js"></script>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0">
|
||||
<meta name="viewport"
|
||||
content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0">
|
||||
<meta http-equiv="X-UA-Compatible" content="ie=edge">
|
||||
<script src="vue.js"></script>
|
||||
<link rel="stylesheet" href="bootstrap/css/bootstrap.css"></link>
|
||||
@@ -18,7 +19,7 @@
|
||||
td,
|
||||
th,
|
||||
tr {
|
||||
border-color: black!important;
|
||||
border-color: black !important;
|
||||
text-overflow: ellipsis;
|
||||
overflow: hidden;
|
||||
white-space: nowrap;
|
||||
@@ -34,84 +35,89 @@
|
||||
|
||||
<body>
|
||||
|
||||
<div class="row" style="margin-top: 40px;">
|
||||
<div class="row" style="margin-top: 40px;">
|
||||
|
||||
<div class="col-md-7" style="margin:0 auto" id="taskInfo" v-if="show">
|
||||
<nav aria-label="breadcrumb">
|
||||
<ol class="breadcrumb" style="padding-left:0;background-color: white">
|
||||
<li class="breadcrumb-item" @click="gotoHome"><a href="#">{{"Home~首页" | lang}}</a></li>
|
||||
<li class="breadcrumb-item active" aria-current="page" style="color: black">{{"Task Information~任务信息" | lang}}</li>
|
||||
</ol>
|
||||
</nav>
|
||||
<h4 style="text-align: center;">{{"Task Information~任务信息" | lang}}</h4>
|
||||
<p>{{"Task Name:~任务名称:" | lang}} {{task["name"]}}</p>
|
||||
<p style="word-wrap: break-word;word-break: break-all;overflow: hidden;max-height: 100px;">{{"Task Description:~任务描述:" | lang}} {{task["desc"]}}</p>
|
||||
<p style="word-wrap: break-word;word-break: break-all;overflow: hidden;max-height: 100px;">{{"Example URL:~样例网址:" | lang}} {{task["url"]}}</p>
|
||||
<p style="word-wrap: break-word;word-break: break-all;overflow: hidden;max-height: 100px;">{{"Create Time:~创建时间:" | lang}} {{dateFormat(task["create_time"])}}</p>
|
||||
<p style="word-wrap: break-word;word-break: break-all;overflow: hidden;max-height: 100px;">{{"Update Time:~更新时间:" | lang}} {{dateFormat(task["update_time"])}}</p>
|
||||
<p>{{"Operations (Please close this window and select 'Design Task' button if you want to modify task with a browser)~操作(如要带浏览器修改任务流程请关闭此窗口并选择设计任务)" | lang}}</p>
|
||||
<p><a style="margin-top: 5px" href="javascript:void(0)" v-on:click="modifyTask(task['id'],task['url'])" class="btn btn-primary">{{"Modify Task~修改任务" | lang}}</a>
|
||||
<a style="margin-top: 5px" href="javascript:void(0)" v-on:click="invokeTask(task['id'],task['url'])" class="btn btn-primary">{{"Execute Task~执行任务" | lang}}</a></p>
|
||||
<p>{{"Input Parameters~输入参数" | lang}}</p>
|
||||
<table class="table table-bordered">
|
||||
<tbody>
|
||||
<tr>
|
||||
<th style="min-width: 50px; text-align: center">ID</th>
|
||||
<th style="text-align: center">{{"Parameter Name~参数名称" | lang}}</th>
|
||||
<th style="text-align: center">{{"Invoke Name~调用名称" | lang}}</th>
|
||||
<th style="text-align: center">{{"Parameter Type~参数类型" | lang}}</th>
|
||||
<th>{{"Example Value~示例值" | lang}}</th>
|
||||
<th>{{"Parameter Description~参数描述" | lang}}</th>
|
||||
</tr>
|
||||
<tr v-if="task.inputParameters.length>0" v-for="i in task.inputParameters.length">
|
||||
<td style="min-width: 50px; text-align: center">{{i}}</td>
|
||||
<td style="text-align: center;white-space: initial">{{task.inputParameters[i-1]["nodeName"]}}</td>
|
||||
<td style="text-align: center">{{task.inputParameters[i-1]["name"]}}</td>
|
||||
<td style="text-align: center">{{task.inputParameters[i-1]["type"]}}</td>
|
||||
<td>{{task.inputParameters[i-1]["exampleValue"]}}</td>
|
||||
<td>{{task.inputParameters[i-1]["desc"]}}</td>
|
||||
</tr>
|
||||
<tr v-if="task.inputParameters.length==0">
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<p>{{"Output Parameters~输出参数" | lang}}</p>
|
||||
<table class="table table-bordered">
|
||||
<tbody>
|
||||
<tr>
|
||||
<th style="min-width: 50px; text-align: center">ID</th>
|
||||
<th style="text-align: center">{{"Parameter Name~参数名称" | lang}}</th>
|
||||
<th style="text-align: center">{{"Parameter Type~参数类型" | lang}}</th>
|
||||
<th>{{"Example Value~示例值" | lang}}</th>
|
||||
<th>{{"Parameter Description~参数描述" | lang}}</th>
|
||||
<th style="text-align: center">{{"Record as a field~作为字段保存" | lang}}</th>
|
||||
</tr>
|
||||
<tr v-if="task.outputParameters.length>0" v-for="i in task.outputParameters.length">
|
||||
<td style="min-width: 50px; text-align: center">{{i}}</td>
|
||||
<td style="text-align: center">{{task.outputParameters[i-1]["name"]}}</td>
|
||||
<td style="text-align: center">{{task.outputParameters[i-1]["type"]}}</td>
|
||||
<td>{{task.outputParameters[i-1]["exampleValue"]}}</td>
|
||||
<td>{{task.outputParameters[i-1]["desc"]}}</td>
|
||||
<td style="text-align: center">{{task.outputParameters[i-1]["recordASField"] == 1? "Yes~是": "No~否" | lang}}</td>
|
||||
</tr>
|
||||
<tr v-if="task.outputParameters.length==0">
|
||||
<td style="min-width: 50px;text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
<div class="col-md-8" style="margin:0 auto" id="taskInfo" v-if="show">
|
||||
<nav aria-label="breadcrumb">
|
||||
<ol class="breadcrumb" style="padding-left:0;background-color: white">
|
||||
<li class="breadcrumb-item" @click="gotoHome"><a href="#">{{"Home~首页" | lang}}</a></li>
|
||||
<li class="breadcrumb-item active" aria-current="page"
|
||||
style="color: black">{{"Task Information~任务信息" | lang}}
|
||||
</li>
|
||||
</ol>
|
||||
</nav>
|
||||
<h4 style="text-align: center;">{{"Task Information~任务信息" | lang}}</h4>
|
||||
<p>{{"Task Name:~任务名称:" | lang}} {{task["name"]}}</p>
|
||||
<p style="word-wrap: break-word;word-break: break-all;overflow: hidden;max-height: 100px;">{{"Task Description:~任务描述:" | lang}} {{task["desc"]}}</p>
|
||||
<p style="word-wrap: break-word;word-break: break-all;overflow: hidden;max-height: 100px;">{{"Example URL:~样例网址:" | lang}} {{task["url"]}}</p>
|
||||
<p style="word-wrap: break-word;word-break: break-all;overflow: hidden;max-height: 100px;">{{"Create Time:~创建时间:" | lang}} {{dateFormat(task["create_time"])}}</p>
|
||||
<p style="word-wrap: break-word;word-break: break-all;overflow: hidden;max-height: 100px;">{{"Update Time:~更新时间:" | lang}} {{dateFormat(task["update_time"])}}</p>
|
||||
<p>{{"Operations (Please close this window and select 'Design Task' button if you want to modify task with a browser)~操作(如要带浏览器修改任务流程请关闭此窗口并选择设计任务)" | lang}}</p>
|
||||
<p><a style="margin-top: 5px" href="javascript:void(0)" v-on:click="modifyTask(task['id'],task['url'])"
|
||||
class="btn btn-primary">{{"Modify Task~修改任务" | lang}}</a>
|
||||
<a style="margin-top: 5px" href="javascript:void(0)" v-on:click="invokeTask(task['id'],task['url'])"
|
||||
class="btn btn-primary">{{"Execute Task~执行任务" | lang}}</a></p>
|
||||
<p>{{"Input Parameters~输入参数" | lang}}</p>
|
||||
<table class="table table-bordered">
|
||||
<tbody>
|
||||
<tr>
|
||||
<th style="min-width: 50px; text-align: center">ID</th>
|
||||
<th style="text-align: center">{{"Parameter Name~参数名称" | lang}}</th>
|
||||
<th style="text-align: center">{{"Invoke Name~调用名称" | lang}}</th>
|
||||
<th style="text-align: center">{{"Parameter Type~参数类型" | lang}}</th>
|
||||
<th>{{"Example Value~示例值" | lang}}</th>
|
||||
<th>{{"Parameter Description~参数描述" | lang}}</th>
|
||||
</tr>
|
||||
<tr v-if="task.inputParameters.length>0" v-for="i in task.inputParameters.length">
|
||||
<td style="min-width: 50px; text-align: center">{{i}}</td>
|
||||
<td style="text-align: center;white-space: initial">{{task.inputParameters[i-1]["nodeName"]}}</td>
|
||||
<td style="text-align: center">{{task.inputParameters[i-1]["name"]}}</td>
|
||||
<td style="text-align: center">{{task.inputParameters[i-1]["type"]}}</td>
|
||||
<td>{{task.inputParameters[i-1]["exampleValue"]}}</td>
|
||||
<td>{{task.inputParameters[i-1]["desc"]}}</td>
|
||||
</tr>
|
||||
<tr v-if="task.inputParameters.length==0">
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<p>{{"Output Parameters~输出参数" | lang}}</p>
|
||||
<table class="table table-bordered">
|
||||
<tbody>
|
||||
<tr>
|
||||
<th style="min-width: 50px; text-align: center">ID</th>
|
||||
<th style="text-align: center">{{"Parameter Name~参数名称" | lang}}</th>
|
||||
<th style="text-align: center">{{"Parameter Type~参数类型" | lang}}</th>
|
||||
<th>{{"Example Value~示例值" | lang}}</th>
|
||||
<th>{{"Parameter Description~参数描述" | lang}}</th>
|
||||
<th style="text-align: center">{{"Record as a field~作为字段保存" | lang}}</th>
|
||||
</tr>
|
||||
<tr v-if="task.outputParameters.length>0" v-for="i in task.outputParameters.length">
|
||||
<td style="min-width: 50px; text-align: center">{{i}}</td>
|
||||
<td style="text-align: center">{{task.outputParameters[i-1]["name"]}}</td>
|
||||
<td style="text-align: center">{{task.outputParameters[i-1]["type"]}}</td>
|
||||
<td>{{task.outputParameters[i-1]["exampleValue"]}}</td>
|
||||
<td>{{task.outputParameters[i-1]["desc"]}}</td>
|
||||
<td style="text-align: center">{{task.outputParameters[i-1]["recordASField"] == 1? "Yes~是": "No~否" | lang}}
|
||||
</td>
|
||||
</tr>
|
||||
<tr v-if="task.outputParameters.length==0">
|
||||
<td style="min-width: 50px;text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
<td style="text-align: center">{{"Empty~无" | lang}}</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
</body>
|
||||
|
||||
@ -128,16 +134,16 @@
|
||||
},
|
||||
methods: {
|
||||
dateFormat: DateFormat,
|
||||
gotoHome:function(){
|
||||
let url = "";
|
||||
if(getUrlParam("lang")=="zh"){
|
||||
url = "taskList.html?lang=zh&type="+getUrlParam("type")+"&wsport="+getUrlParam("wsport")+"&backEndAddressServiceWrapper="+ app.$data.backEndAddressServiceWrapper
|
||||
} else{
|
||||
url = "taskList.html?lang=en&type="+getUrlParam("type")+"&wsport="+getUrlParam("wsport")+"&backEndAddressServiceWrapper="+ app.$data.backEndAddressServiceWrapper
|
||||
}
|
||||
window.location.href= url;
|
||||
gotoHome: function () {
|
||||
let url = "";
|
||||
if (getUrlParam("lang") == "zh") {
|
||||
url = "taskList.html?lang=zh&type=" + getUrlParam("type") + "&wsport=" + getUrlParam("wsport") + "&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper
|
||||
} else {
|
||||
url = "taskList.html?lang=en&type=" + getUrlParam("type") + "&wsport=" + getUrlParam("wsport") + "&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper
|
||||
}
|
||||
window.location.href = url;
|
||||
},
|
||||
modifyTask: function(id, url) {
|
||||
modifyTask: function (id, url) {
|
||||
let message = { // show the flowchart
|
||||
type: 1, // message type: pass the link
|
||||
message: {
|
||||
@ -146,21 +152,20 @@
|
||||
};
|
||||
// ws.send(JSON.stringify(message));
|
||||
// window.location.href = url; // navigate to the link
|
||||
if(getUrlParam("lang")=="zh"){
|
||||
window.location.href = "FlowChart_CN.html?type="+getUrlParam("type")+"&lang="+getUrlParam("lang")+"&id=" + id + "&wsport="+getUrlParam("wsport")+"&backEndAddressServiceWrapper="+ app.$data.backEndAddressServiceWrapper
|
||||
} else{
|
||||
window.location.href = "FlowChart.html?type="+getUrlParam("type")+"&lang="+getUrlParam("lang")+"&id=" + id + "&wsport="+getUrlParam("wsport")+"&backEndAddressServiceWrapper="+ app.$data.backEndAddressServiceWrapper
|
||||
if (getUrlParam("lang") == "zh") {
|
||||
window.location.href = "FlowChart_CN.html?type=" + getUrlParam("type") + "&lang=" + getUrlParam("lang") + "&id=" + id + "&wsport=" + getUrlParam("wsport") + "&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper
|
||||
} else {
|
||||
window.location.href = "FlowChart.html?type=" + getUrlParam("type") + "&lang=" + getUrlParam("lang") + "&id=" + id + "&wsport=" + getUrlParam("wsport") + "&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper
|
||||
}
|
||||
|
||||
},
|
||||
invokeTask: function(id) {
|
||||
window.location.href = "executeTask.html?type="+getUrlParam("type")+"&lang="+getUrlParam("lang")+"&id=" + id + "&wsport="+getUrlParam("wsport")+"&backEndAddressServiceWrapper="+ app.$data.backEndAddressServiceWrapper;
|
||||
invokeTask: function (id) {
|
||||
window.location.href = "executeTask.html?type=" + getUrlParam("type") + "&lang=" + getUrlParam("lang") + "&id=" + id + "&wsport=" + getUrlParam("wsport") + "&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper;
|
||||
},
|
||||
}
|
||||
});
|
||||
$.get(app.$data.backEndAddressServiceWrapper + "/queryTask?id=" + sId, function(result) {
|
||||
$.get(app.$data.backEndAddressServiceWrapper + "/queryTask?id=" + sId, function (result) {
|
||||
app.$data.task = result;
|
||||
app.$data.show = true;
|
||||
});
|
||||
|
||||
</script>
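The navigation methods in the script above (`gotoHome`, `modifyTask`, `invokeTask`) assemble their target URLs by string concatenation, so a value such as `backEndAddressServiceWrapper` is placed in the query string unencoded. As a rough sketch only, and not something the diff itself does, the same query string could be built with `URLSearchParams`, which percent-encodes each value. The helper name `buildPageUrl` and the sample `wsport` value are hypothetical, and whether `getUrlParam` in `global.js` decodes values on the receiving page is an assumption that would need checking.

```javascript
// Hypothetical helper (not part of the diff): builds the same kind of
// query string as the concatenations in gotoHome/modifyTask/invokeTask,
// but lets URLSearchParams handle the percent-encoding.
function buildPageUrl(page, params) {
    const query = new URLSearchParams();
    for (const [key, value] of Object.entries(params)) {
        // Skip missing values so "undefined"/"null" never end up in the URL.
        if (value !== undefined && value !== null) {
            query.set(key, String(value));
        }
    }
    return page + "?" + query.toString();
}

// Sample values are illustrative; only port 8074 appears in the diff.
const url = buildPageUrl("executeTask.html", {
    type: "3",
    lang: "zh",
    id: 228,
    wsport: "8084",
    backEndAddressServiceWrapper: "http://localhost:8074",
});
console.log(url);
```

Keeping the parameter list in one object also makes it harder for the `zh` and `en` branches to drift apart, since only the page name or the `lang` value needs to change.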
|
||||
|
@ -4,85 +4,171 @@
|
||||
<head>
|
||||
<script src="jquery-3.4.1.min.js"></script>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0">
|
||||
<meta name="viewport"
|
||||
content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0">
|
||||
<meta http-equiv="X-UA-Compatible" content="ie=edge">
|
||||
<script src="vue.js"></script>
|
||||
<link rel="stylesheet" href="bootstrap/css/bootstrap.css"></link>
|
||||
<link rel="stylesheet" href="element-ui/index.css"></link>
|
||||
<script src="element-ui/index.js"></script>
|
||||
<title>任务列表 | Task List</title>
|
||||
|
||||
</head>
|
||||
<style>
|
||||
th,td{
|
||||
th, td {
|
||||
text-align: left;
|
||||
vertical-align: middle!important;
|
||||
vertical-align: middle !important;
|
||||
}
|
||||
|
||||
@media (max-width: 500px) {
|
||||
.tasklist{
|
||||
margin-left:10%!important;
|
||||
.tasklist {
|
||||
margin-left: 10% !important;
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
.search-header {
|
||||
display: flex;
|
||||
justify-content: flex-end; /* Right align the search box */
|
||||
align-items: center;
|
||||
}
|
||||
|
||||
.search-input {
|
||||
/*margin-right: 8px; !* Optional: Adjust spacing between input and button *!*/
|
||||
}
|
||||
|
||||
.task-links {
|
||||
display: flex;
|
||||
justify-content: space-between; /* Spread links evenly */
|
||||
}
|
||||
</style>
|
||||
<body>
|
||||
<div class="row" style="margin-top: 40px;">
|
||||
<div style="margin:0 auto; min-width: 70%;" id="taskList" class="tasklist">
|
||||
<h4 style="text-align: center;">{{"Task List~任务列表" | lang}}</h4>
|
||||
<h5 style="text-align: center;" v-if="mobile==1">{{"View this table by direction keys on keyboard~按键盘方向键浏览此表格" | lang}}</h5>
|
||||
<p><a v-if="type==3" href="javascript:void(0)" v-on:click="newTask" class="btn btn-primary">{{"New Task~创建新任务" | lang}}</a></p>
|
||||
<div v-if="type != 3" style="margin-bottom: 20px">
|
||||
<div style="margin-bottom: 5px">{{"提示:下方的官方教程和答疑平台均在Github,可能出现访问速度慢的问题,请耐心等待。~" | lang}}</div>
|
||||
<a class="btn btn-primary" href="https://github.com/NaiboWang/EasySpider/wiki" target="_blank">{{"Software Documentation~软件使用说明文档" | lang}}</a>
|
||||
<a class="btn btn-primary" href="https://github.com/NaiboWang/EasySpider/issues?q=is%3Aissue" target="_blank">{{"Ask questions here~官方答疑平台" | lang}}</a>
|
||||
<a class="btn btn-primary" href="https://github.com/NaiboWang/EasySpider/issues/22" target="_blank">{{"See how to run task by schedule~定时执行任务教程" | lang}}</a>
|
||||
<!-- <a class="btn btn-primary" href="https://github.com/NaiboWang/EasySpider/wiki/Run-multiple-tasks-in-parallel" target="_blank">{{"See how to run multiple tasks in parallel~同时执行多个任务教程" | lang}}</a>-->
|
||||
<div class="row" style="margin-top: 40px;">
|
||||
<div style="margin:0 auto; min-width: 70%;" id="taskList" class="tasklist">
|
||||
<h4 style="text-align: center;">{{"Task List~任务列表" | lang}}</h4>
|
||||
<h5 style="text-align: center;"
|
||||
v-if="mobile==1">{{"View this table by direction keys on keyboard~按键盘方向键浏览此表格" | lang}}</h5>
|
||||
<p><a v-if="type==3" href="javascript:void(0)" v-on:click="newTask"
|
||||
class="btn btn-primary">{{"New Task~创建新任务" | lang}}</a></p>
|
||||
<div v-if="type != 3" style="margin-bottom: 20px">
|
||||
<div style="margin-bottom: 5px">{{"提示:下方的官方教程和答疑平台均在Github,可能出现访问速度慢的问题,请耐心等待。~" | lang}}
|
||||
</div>
|
||||
<div style="margin-bottom: 10px">
|
||||
<table style="table-layout: auto;" class="table table-hover">
|
||||
<thead>
|
||||
<tr>
|
||||
<th style="text-align: center">No.</th>
|
||||
<th style="text-align: center">ID</th>
|
||||
<th style="text-align: center">{{"Task Name~任务名称" | lang}}</th>
|
||||
<th>{{"URL~网址" | lang}}</th>
|
||||
<th v-bind:colspan="type" style="min-width: 300px">{{"Operations~操作" | lang}}</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr v-for="i in list.length">
|
||||
<td style="text-align: center">{{i}}</td>
|
||||
<td style="text-align: center">{{list[i-1]["id"]}}</td>
|
||||
<!-- <td style="overflow: hidden;; max-width: 200px;text-align: center">{{list[i-1]["id"]}}</td>-->
|
||||
<td style="overflow: hidden;; max-width: 200px;text-align: center">{{list[i-1]["name"]}}</td>
|
||||
<td style="height: 30px;overflow: hidden; max-width: 200px">{{list[i-1]["url"]}}</td>
|
||||
<td style="text-align: left"><a href="javascript:void(0)" v-on:click="browseTask(list[i-1]['id'])">{{"Task Information~任务信息" | lang}}</a></td>
|
||||
<td style="text-align: left;font-weight: bold" v-if="type==3"><a href="javascript:void(0)" v-on:click="modifyTask(list[i-1]['id'],list[i-1]['url'])">{{"Modify Task~修改任务" | lang}}</a></td>
|
||||
<td style="text-align: left"><a disabled href="javascript:void(0)" v-on:dblclick="deleteTask(list[i-1]['id'])">{{"Delete Task (Double Click)~删除任务(双击)" | lang}}</a></td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
|
||||
<a class="btn btn-primary" href="https://github.com/NaiboWang/EasySpider/wiki"
|
||||
target="_blank">{{"Software Documentation~软件使用说明文档" | lang}}</a>
|
||||
<a class="btn btn-primary" href="https://github.com/NaiboWang/EasySpider/issues?q=is%3Aissue"
|
||||
target="_blank">{{"Ask questions here~官方答疑平台" | lang}}</a>
|
||||
<a class="btn btn-primary" href="https://github.com/NaiboWang/EasySpider/issues/22"
|
||||
target="_blank">{{"See how to run task by schedule~定时执行任务教程" | lang}}</a>
|
||||
<!-- <a class="btn btn-primary" href="https://github.com/NaiboWang/EasySpider/wiki/Run-multiple-tasks-in-parallel" target="_blank">{{"See how to run multiple tasks in parallel~同时执行多个任务教程" | lang}}</a>-->
|
||||
</div>
|
||||
</div>
|
||||
<el-table
|
||||
style="width: 100%"
|
||||
:empty-text="LANG('No Task~暂无任务')"
|
||||
:data="list.filter(data => !search || (data.name.toLowerCase().includes(search.toLowerCase())) || (data.url.toLowerCase().includes(search.toLowerCase())) || (data.links.includes(search.toLowerCase())) || (data.desc.includes(search.toLowerCase())) || (data.id.toString().includes(search.toLowerCase())))"
|
||||
:default-sort="{prop: 'mtime', order: 'descending'}"
|
||||
>
|
||||
|
||||
<el-table-column
|
||||
prop="id"
|
||||
:label="LANG('Task ID~任务ID')"
|
||||
sortable
|
||||
width="120"
|
||||
align="center"
|
||||
>
|
||||
</el-table-column>
|
||||
<el-table-column
|
||||
prop="name"
|
||||
:label="LANG('Task Name~任务名称')"
|
||||
sortable
|
||||
align="center"
|
||||
>
|
||||
</el-table-column>
|
||||
<el-table-column
|
||||
prop="url"
|
||||
label="URL"
|
||||
sortable
|
||||
>
|
||||
</el-table-column>
|
||||
<!-- <el-table-column-->
|
||||
<!-- prop="mtime"-->
|
||||
<!-- :label="LANG('Update Time~更新时间')"-->
|
||||
<!-- sortable-->
|
||||
<!-- :formatter="formatDate"-->
|
||||
<!-- width="170"-->
|
||||
<!-- >-->
|
||||
</el-table-column>
|
||||
<el-table-column
|
||||
width="350"
|
||||
align="center">
|
||||
<!-- Header template for the search input -->
|
||||
<template slot="header" slot-scope="scope">
|
||||
<div class="search-header">
|
||||
<!-- Search input aligned to the right -->
|
||||
<el-input
|
||||
v-model="search"
|
||||
class="search-input"
|
||||
prefix-icon="el-icon-search"
|
||||
:placeholder="LANG('Please input keywords to search~请输入关键词搜索')">
|
||||
</el-input>
|
||||
<!-- <el-button icon="el-icon-search"></el-button>-->
|
||||
</div>
|
||||
</template>
|
||||
<template slot-scope="scope">
|
||||
<!-- Use flex container to justify content space-around -->
|
||||
<div class="task-links">
|
||||
<a href="javascript:void(0)" v-on:click="browseTask(scope.$index, scope.row)">{{ "View~任务信息"
|
||||
| lang }}</a>
|
||||
<a href="javascript:void(0)" v-if="type==3" v-on:click="modifyTask(scope.$index, scope.row)">{{
|
||||
"Modify~修改任务" | lang }}</a>
|
||||
<a href="javascript:void(0)"
|
||||
v-on:dblclick="deleteTask(scope.$index, scope.row)">{{ "Delete (Double Click)~删除任务(双击)" | lang }}</a>
|
||||
</div>
|
||||
</template>
|
||||
</el-table-column>
|
||||
</el-table>
|
||||
</div>
|
||||
</div>
|
||||
</body>
|
||||
|
||||
</html>
|
||||
<script src="global.js"></script>
|
||||
<script>
|
||||
var app = new Vue({
|
||||
let app = new Vue({
|
||||
el: '#taskList',
|
||||
data: {
|
||||
search: '',
|
||||
list: [],
|
||||
type: 3, // records the service behavior
|
||||
mobile: getUrlParam("mobile"),
|
||||
backEndAddressServiceWrapper: getUrlParam("backEndAddressServiceWrapper"),
|
||||
},
|
||||
methods: {
|
||||
newTask: function (){
|
||||
window.location.href = "newTask.html?lang="+getUrlParam("lang")+"&mobile="+getUrlParam("mobile")+"&wsport="+getUrlParam("wsport")+"&backEndAddressServiceWrapper="+ app.$data.backEndAddressServiceWrapper;
|
||||
formatDate: function (row, column) {
|
||||
//2023-01-01 00:00:00
|
||||
let date = row[column.property];
|
||||
// 2023-12-26T12:44:32.599Z
|
||||
let original_time = row.mtime;
|
||||
let year = original_time.substring(0, 4);
|
||||
let month = original_time.substring(5, 7);
|
||||
let day = original_time.substring(8, 10);
|
||||
let hour = original_time.substring(11, 13);
|
||||
let minute = original_time.substring(14, 16);
|
||||
let second = original_time.substring(17, 19);
|
||||
return year + "-" + month + "-" + day + " " + hour + ":" + minute + ":" + second;
|
||||
},
|
||||
modifyTask: function(id, url) {
|
||||
newTask: function () {
|
||||
window.location.href = "newTask.html?lang=" + getUrlParam("lang") + "&mobile=" + getUrlParam("mobile") + "&wsport=" + getUrlParam("wsport") + "&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper;
|
||||
},
|
||||
LANG: function (text) {
|
||||
if (getUrlParam("lang") == "en" || getUrlParam("lang") == "") {
|
||||
return text.split("~")[0];
|
||||
} else if (getUrlParam("lang") == "zh") {
|
||||
return text.split("~")[1];
|
||||
}
|
||||
},
|
||||
modifyTask: function (index, row) {
|
||||
let id = row.id;
|
||||
let url = row.links.split("\n")[0];
|
||||
console.log(index, row)
|
||||
let message = { // show the flowchart
|
||||
type: 1, // message type: pass the link
|
||||
message: {
|
||||
@ -92,10 +178,12 @@
|
||||
ws.send(JSON.stringify(message));
|
||||
window.location.href = url; // navigate to the link
|
||||
},
|
||||
browseTask: function(id) {
|
||||
window.location.href = "taskInfo.html?type="+getUrlParam("type")+"&id=" + id + "&lang="+getUrlParam("lang")+"&wsport="+getUrlParam("wsport")+"&backEndAddressServiceWrapper="+ app.$data.backEndAddressServiceWrapper; // navigate to the link
|
||||
browseTask: function (index, row) {
|
||||
let id = row.id;
|
||||
window.location.href = "taskInfo.html?type=" + getUrlParam("type") + "&id=" + id + "&lang=" + getUrlParam("lang") + "&wsport=" + getUrlParam("wsport") + "&backEndAddressServiceWrapper=" + app.$data.backEndAddressServiceWrapper; // navigate to the link
|
||||
},
|
||||
deleteTask: function(id) {
|
||||
deleteTask: function (index, row) {
|
||||
let id = row.id;
|
||||
// let text = "Are you sure to remove the selected task?";
|
||||
// if (getUrlParam("lang") == "en"|| getUrlParam("lang")=="") {
|
||||
// text = "Are you sure to remove the selected task?";
|
||||
@ -103,30 +191,30 @@
|
||||
// text = "确定要删除选中的任务吗?";
|
||||
// }
|
||||
// if (confirm(text)) {
|
||||
$.get(app.$data.backEndAddressServiceWrapper + "/deleteTask?id=" + id, function(res) {
|
||||
$.get(app.$data.backEndAddressServiceWrapper + "/queryTasks", function(re) {
|
||||
result = re.sort(desc);
|
||||
app.$data.list = result;
|
||||
});
|
||||
$.get(app.$data.backEndAddressServiceWrapper + "/deleteTask?id=" + id, function (res) {
|
||||
$.get(app.$data.backEndAddressServiceWrapper + "/queryTasks", function (re) {
|
||||
result = re.sort(desc);
|
||||
app.$data.list = result;
|
||||
});
|
||||
// alert("Sorry, the task cannot be deleted since the system is a demo system for paper reviewers, please contact the author (naibowang@u.nus.edu) to remove it.")
|
||||
});
|
||||
// alert("Sorry, the task cannot be deleted since the system is a demo system for paper reviewers, please contact the author (naibowang@u.nus.edu) to remove it.")
|
||||
// }
|
||||
},
|
||||
}
|
||||
});
|
||||
|
||||
var desc = function(x, y) {
|
||||
let desc = function (x, y) {
|
||||
return (x["id"] < y["id"]) ? 1 : -1
|
||||
}
|
||||
$.get(app.$data.backEndAddressServiceWrapper + "/queryTasks", function(re) {
|
||||
$.get(app.$data.backEndAddressServiceWrapper + "/queryTasks", function (re) {
|
||||
// result = re.sort(desc);
|
||||
app.$data.list = re;
|
||||
if (getUrlParam("type") == "1") {
|
||||
app.$data.type = 2;
|
||||
}
|
||||
});
|
||||
ws = new WebSocket("ws://localhost:"+getUrlParam("wsport"));
|
||||
ws.onopen = function() {
|
||||
ws = new WebSocket("ws://localhost:" + getUrlParam("wsport"));
|
||||
ws.onopen = function () {
|
||||
// WebSocket is connected; use send() to send data
|
||||
console.log("已连接");
|
||||
message = {
|
||||
@ -137,7 +225,7 @@
|
||||
};
|
||||
this.send(JSON.stringify(message));
|
||||
};
|
||||
ws.onclose = function() {
|
||||
ws.onclose = function () {
|
||||
// WebSocket closed
|
||||
console.log("连接已关闭...");
|
||||
};
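The `el-table` in the taskList.html hunk above filters `list` with one long inline expression in its `:data` binding. In that expression only `name` and `url` are lower-cased before comparison, while `links`, `desc`, and `id` are matched against the lower-cased search text as they are, so a search for "PDF" would not match a description containing "Download PDF". The following is a minimal sketch of the same predicate as a named function that lower-cases every field. The field names come from the diff, but `matchesSearch` and the sample tasks are hypothetical, and treating missing fields as empty strings is an assumption rather than the author's behaviour.

```javascript
// Sketch of the el-table filter as a named function. Field names
// (name, url, links, desc, id) are taken from the diff; the function
// name and the sample tasks below are made up for illustration.
function matchesSearch(task, search) {
    if (!search) {
        return true; // an empty search box shows every task
    }
    const needle = search.toLowerCase();
    // Unlike the inline expression, every field is lower-cased here,
    // and missing fields are treated as empty strings.
    return [task.name, task.url, task.links, task.desc, task.id]
        .map((field) => String(field ?? "").toLowerCase())
        .some((text) => text.includes(needle));
}

const tasks = [
    {
        id: 228,
        name: "arXiv paper",
        url: "https://arxiv.org/abs/2312.02977",
        links: "https://arxiv.org/abs/2312.02977",
        desc: "Download PDF",
    },
    {
        id: 229,
        name: "Zhihu",
        url: "https://www.zhihu.com",
        links: "https://www.zhihu.com",
        desc: "",
    },
];

// IDs of the tasks that match a mixed-case query.
const hits = tasks.filter((task) => matchesSearch(task, "PDF")).map((task) => task.id);
console.log(hits);
```

In a Vue 2 component this could be exposed as a method or a computed property and referenced from `:data`, which keeps the template short; that wiring is left out here.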
|
||||
|
3
ElectronJS/start_server.js
Normal file
@ -0,0 +1,3 @@
|
||||
const path = require("path");
|
||||
const task_server = require(path.join(__dirname, "server.js"));
|
||||
task_server.start(8074); //start local server
|
2
ElectronJS/stealth.min.js
vendored
File diff suppressed because one or more lines are too long
@ -1 +1 @@
|
||||
{"id":228,"name":"[2312.02977] Exploring the nonclassical dynamics of the \"classical'' Schrödinger equation","url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","create_time":"12/7/2023, 2:44:58 AM","update_time":"12/7/2023, 2:56:47 AM","version":"0.6.0","saveThreshold":10,"quitWaitTime":60,"environment":1,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"desc":"https://arxiv.org/abs/2312.02977","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://arxiv.org/abs/2312.02977","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://arxiv.org/abs/2312.02977"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, 
\"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}}]}
|
||||
{"id":228,"name":"[2312.02977] Exploring the nonclassical dynamics of the \"classical'' Schrödinger equation","url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","create_time":"12/7/2023, 2:44:58 AM","update_time":"2024-01-05 22:08:46","version":"0.6.0","saveThreshold":10,"quitWaitTime":3,"environment":1,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"TTT","dataWriteMode":3,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://arxiv.org/abs/2312.02977","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://arxiv.org/abs/2312.02977","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://arxiv.org/abs/2312.02977"},{"id":1,"name":"loopTimes_1","nodeId":5,"nodeName":"循环 - 单个元素","desc":"循环循环 - 单个元素执行的次数(0代表无限循环)","type":"int","exampleValue":10,"value":10}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,5],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://arxiv.org/abs/2312.02977","links":"https://arxiv.org/abs/2312.02977","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":3,"index":2,"parentId":2,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, 
\"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":-1,"index":3,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":2,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, \"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":-1,"index":4,"parentId":0,"type":0,"option":2,"title":"点击Download PDF","sequence":[],"isInLoop":false,"position":3,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"//*[contains(@class, \"download-pdf\")]","iframe":false,"wait":2,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"clickWay":0,"maxWaitTime":10,"params":[],"alertHandleType":0,"allXPaths":["/html/body/div[2]/main[1]/div[1]/div[1]/div[2]/div[1]/ul[1]/li[1]/a[1]","//a[contains(., 'Download P')]","//A[@class='abs-button 
download-pdf']","/html/body/div[last()-3]/main/div/div/div[last()-2]/div[last()-5]/ul/li[last()-2]/a"]}},{"id":2,"index":5,"parentId":0,"type":1,"option":8,"title":"循环 - 单个元素","sequence":[2],"isInLoop":false,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"//body","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":0,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":10,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0}}]}
|
@ -1 +1 @@
|
||||
{"id":229,"name":"知乎 - 有问题,就会有答案","url":"https://www.zhihu.com","links":"https://www.zhihu.com","create_time":"07/12/2023, 03:26:24","update_time":"07/12/2023, 03:43:34","version":"0.6.0","saveThreshold":10,"quitWaitTime":6,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","inputExcel":"","startFromExit":0,"pauseKey":"t","containJudge":false,"desc":"https://www.zhihu.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.zhihu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.zhihu.com"}],"outputParameters":[{"id":0,"name":"参数1_文本","desc":"","type":"text","recordASField":1,"exampleValue":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.zhihu.com","links":"https://www.zhihu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","wai
tElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":["/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","//div[contains(., '死刑执行前可以谎称肚')]","/html/body/div[last()-7]/div/main/div/div/div[last()-1]/div/div/div/div/div/div[last()-12]/div/div/div/div/h2/div"]}},{"id":3,"index":3,"parentId":2,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":0,"contentType":0,"relative":true,"name":"参数1_文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"unique_index":"onlvi030w9jlpu5tjzb","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0}],"loopType":1}}]}
|
||||
{"id":229,"name":"知乎 - 有问题,就会有答案","url":"https://www.zhihu.com","links":"https://www.zhihu.com","create_time":"07/12/2023, 03:26:24","update_time":"2023-12-27 20:05:50","version":"0.6.0","saveThreshold":10,"quitWaitTime":6,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"xlsx","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"t","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"知了个乎","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.zhihu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.zhihu.com"},{"id":1,"name":"loopTimes_1","nodeId":4,"nodeName":"循环 - 单个元素","desc":"循环循环 - 单个元素执行的次数(0代表无限循环)","type":"int","exampleValue":0,"value":0}],"outputParameters":[{"id":0,"name":"参数1_文本","desc":"","type":"text","recordASField":1,"exampleValue":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,4,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.zhihu.com","links":"https://www.zhihu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":3,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":2,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]
/div[1]/div/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"allXPaths":["/html/body/div[1]/div[1]/main[1]/div[1]/div[2]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[2]/div[1]/div[1]/div[1]/div[1]/h2[1]/div[1]","//div[contains(., '死刑执行前可以谎称肚')]","/html/body/div[last()-7]/div/main/div/div/div[last()-1]/div/div/div/div/div/div[last()-12]/div/div/div/div/h2/div"]}},{"id":4,"index":3,"parentId":3,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":5,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":0,"contentType":0,"relative":true,"name":"参数1_文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"死刑执行前可以谎称肚子痛,想排泄粪便,籍此拖延时间吗?"}],"unique_index":"onlvi030w9jlpu5tjzb","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0}],"loopType":1}},{"id":2,"index":4,"parentId":0,"type":1,"option":8,"title":"循环 - 
单个元素","sequence":[],"isInLoop":false,"position":1,"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":0,"pathList":"","textList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0}}]}
|
File diff suppressed because one or more lines are too long
1
ElectronJS/tasks/310.json
Normal file
File diff suppressed because one or more lines are too long
Some files were not shown because too many files have changed in this diff.