🎨: V4 preview release

This commit is contained in:
Evil0ctal 2024-04-22 21:02:42 -07:00
parent b846416bc9
commit e8121533dd
87 changed files with 7993 additions and 4726 deletions

3
.gitignore vendored
View File

@@ -129,4 +129,5 @@ dmypy.json
.pyre/
# pycharm
.idea
.idea
/app/api/endpoints/download/

214
LICENSE
View File

@@ -1,21 +1,201 @@
MIT License
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
Copyright (c) 2021 Evil0ctal
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
1. Definitions.
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@@ -1 +1 @@
web: python web_app.py --port=$PORT
app: python start.py --port=$PORT

177
README.md
View File

@@ -31,8 +31,33 @@
</div>
## 🔊 This project is planned to be refactored in V4.0.0.
If you are interested, add WeChat `Evil0ctal` with the note "GitHub project refactor". We currently need crawler/back-end/full-stack developers, but you are welcome even without the relevant stack; the main goal is a group chat where everyone can learn from each other. No ads or anything illegal, purely making friends and exchanging technical ideas.
## 🔊 V4.0.0 refactor
> TODO:
- The outdated Bilibili code was removed and needs to be rewritten.
- Group members have requested parsing support for Kuaishou and Xigua Video.
- The README is out of date and needs a rewrite.
- Publish a PyPI package.
- The config.yaml file needs to be cleaned up.
- Add parsing of user profile pages.
- The iOS Shortcut needs an update for the latest API responses and paths.
- A desktop downloader or browser extension could be built if there is demand.
- Solve the crawler cookie risk-control problem.
> Changes
- PyWebIO now runs as a sub-app of FastAPI.
- Rewrote the Douyin and TikTok endpoints, thanks to [@johnserf-seed](https://github.com/Johnserf-Seed).
- Rewrote the file download endpoint; it now uses asynchronous file IO.
- Added annotations and example values to all endpoints.
- Reorganized the project file structure.
> Notes
If you'd like to help build this project, add WeChat `Evil0ctal` with the note "GitHub project refactor". Everyone can learn from each other in the group; no ads or anything illegal, purely making friends and exchanging technical ideas.
> Private API service
Discord: [TikHub Discord](https://discord.com/invite/aMEAS8Xsvz)
@@ -42,20 +67,19 @@ Free Douyin/TikTok API: [TikHub Beta API](https://beta.tikhub.io/)
> 🚨 To run this project on a private server, see the deployment options: [[Docker deployment](./README.md#%E9%83%A8%E7%BD%B2%E6%96%B9%E5%BC%8F%E4%BA%8C-docker), [one-click deployment](./README.md#%E9%83%A8%E7%BD%B2%E6%96%B9%E5%BC%8F%E4%B8%80-linux)]
Built on [PyWebIO](https://github.com/pywebio/PyWebIO), [FastAPI](https://fastapi.tiangolo.com/), and [AIOHTTP](https://docs.aiohttp.org/), this project is a fast, asynchronous [Douyin](https://www.douyin.com/)/[TikTok](https://www.tiktok.com/)/[Bilibili](https://www.bilibili.com) data scraper. The web front end offers online batch parsing, watermark-free video and gallery downloads, a data-scraping API, iOS Shortcut watermark-free downloads, and more. You can deploy or adapt it yourself, call [scraper.py](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/Stable/scraper.py) directly from your own project, or install the existing [pip package](https://pypi.org/project/douyin-tiktok-scraper/) and use it as a parsing library.
Built on [PyWebIO](https://github.com/pywebio/PyWebIO), [FastAPI](https://fastapi.tiangolo.com/), and [HTTPX](https://www.python-httpx.org/), this project is a fast, asynchronous [Douyin](https://www.douyin.com/)/[TikTok](https://www.tiktok.com/) data scraper. The web front end offers online batch parsing, watermark-free video and gallery downloads, a data-scraping API, iOS Shortcut watermark-free downloads, and more. You can deploy or adapt it yourself, call [scraper.py](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/Stable/scraper.py) directly from your own project, or install the existing [pip package](https://pypi.org/project/douyin-tiktok-scraper/) and use it as a parsing library.
*Some simple use cases:*
*Download videos that block downloading, run data analysis, download without watermark on iOS (pair the [built-in iOS Shortcuts app](https://apps.apple.com/cn/app/%E5%BF%AB%E6%8D%B7%E6%8C%87%E4%BB%A4/id915249334) with this project's API for in-app downloads or clipboard-based downloads), and more...*
## 🖥Public site: it's fragile... please don't stress-test it (·•᷄ࡇ•᷅
## 🖥Demo site: it's fragile... please don't stress-test it (·•᷄ࡇ•᷅
> **TikHub-API:** supports scraping `Douyin|TikTok` user profiles: the author's profile video data (watermark-free links), liked-video list (must be public), video comments, background-music video lists, and more. See the TikHub-API documentation for details. Compared with this project's API, TikHub-API is also faster at scraping TikTok data.
🍔Web APP: [https://douyin.wtf/](https://douyin.wtf/)
🍟API Document: [https://api.douyin.wtf/docs](https://api.douyin.wtf/docs)
🍟API Document: [https://douyin.wtf/docs](https://douyin.wtf/docs)
🌭TikHub API Document: [https://api.tikhub.io/docs](https://api.tikhub.io/docs)
@@ -67,52 +91,53 @@ Free Douyin/TikTok API: [TikHub Beta API](https://beta.tikhub.io/)
- [HFrost0/bilix](https://github.com/HFrost0/bilix)
- [Tairraos/TikDown - [needs update]](https://github.com/Tairraos/TikDown/)
🛸Other repositories based on this project
- [TikHubIO/TikHub_API_PyPi](https://github.com/TikHubIO/TikHub_API_PyPi)
- [Evil0ctal/Douyin_Tiktok_Scraper_PyPi](https://github.com/Evil0ctal/Douyin_Tiktok_Scraper_PyPi)
## ⚗️Tech stack
* [web_app.py](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/web_app.py) - [PyWebIO](https://www.pyweb.io/)
* [web_api.py](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/web_api.py) - [FastAPI](https://fastapi.tiangolo.com/)
* [scraper.py](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/scraper.py) - [AIOHTTP](https://docs.aiohttp.org/)
* [/app/web](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/app/web) - [PyWebIO](https://www.pyweb.io/)
* [/app/api](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/app/api) - [FastAPI](https://fastapi.tiangolo.com/)
* [/crawlers](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/crawlers) - [HTTPX](https://www.python-httpx.org/)
> ***scraper.py:***
> ***/crawlers***
- Submits requests to the [Douyin|TikTok] APIs, retrieves and processes the data, and returns a dict; supports async.
- Submits requests to the different platforms' APIs, retrieves and processes the data, and returns a dict; supports async.
> ***web_api.py:***
> ***/app/api***
- Takes the request parameters, processes the data with the `Scraper()` class, and returns it as JSON; video downloads pair with the iOS Shortcut for quick calls; supports async.
- Takes the request parameters, processes the data with the relevant `Crawlers` classes, and returns it as JSON; video downloads pair with the iOS Shortcut for quick calls; supports async.
> ***web_app.py:***
> ***/app/web***
- A simple web app built for `web_api.py` and `scraper.py`: values entered on the page are processed with the `Scraper()` class, and the output of the `web_api.py` endpoints is rendered on the page (similar to a front-end/back-end split).
- A simple web app built with `PyWebIO`: values entered on the page are processed with the relevant `Crawlers` classes, and the resulting data is rendered on the page.
***Most parameters of the files above can be modified in [config.ini](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/config.ini)***
***Most parameters of the files above can be modified in the corresponding `config.yaml`***
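The snippet below is a minimal sketch of how the new code reads these values; it mirrors the `yaml.safe_load` pattern used by the endpoints added in this commit, and the key names shown are the ones referenced in `app/api/endpoints/download.py`.

```python
# Minimal sketch: load config.yaml the same way the new endpoints do.
# The key names below are the ones referenced in the download endpoint of this commit.
import yaml

with open("config.yaml", "r", encoding="utf-8") as f:
    config = yaml.safe_load(f)

print(config["Download_Switch"])       # whether the /download endpoint is enabled
print(config["Download_Path"])         # where downloaded files are stored
print(config["Download_File_Prefix"])  # optional prefix for saved file names
```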
## 💡Project file structure
```
.
└── Douyin_TikTok_Download_API/
├── /static -> (PyWebIO static resources)
├── web_app.py -> (Web APP)
├── web_api.py -> (API)
├── scraper.py -> (Parsing library)
├── config.ini -> (Configuration file)
├── install.sh -> (Installation bash script)
./Douyin_TikTok_Download_API
├─app
│ ├─api
│ │ ├─endpoints
│ │ └─models
│ ├─download
│ └─web
│ └─views
└─crawlers
├─douyin
│ └─web
├─hybrid
├─tiktok
│ ├─app
│ └─web
└─utils
```
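As a quick orientation, the imports below (each taken from a file in this commit) show how the new directories map onto Python packages:

```python
# How the new layout maps onto imports (each line appears in this commit).
from app.api.router import router as api_router              # app/api    -> FastAPI routers
from app.web.app import MainView                              # app/web    -> PyWebIO views
from crawlers.hybrid.hybrid_crawler import HybridCrawler      # crawlers/hybrid -> mixed Douyin/TikTok parsing
from crawlers.tiktok.web.web_crawler import TikTokWebCrawler  # crawlers/tiktok/web -> TikTok Web API crawler
```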
## ✨Features:
- Douyin (and its overseas version, TikTok) video/image parsing
- Bilibili video parsing
- Xigua Video parsing
- Kuaishou video parsing
- Most Douyin Web APIs
- Most TikTok Web APIs
- Batch parsing on the web page (mixed Douyin/TikTok submissions supported)
- Batch download of watermark-free videos from the web results page (removed in V3.X and later; deploy V2.X yourself if needed)
- Online download of videos or image galleries.
- Fetch link data via API calls
- A [pip package](https://pypi.org/project/douyin-tiktok-scraper/) for quickly importing it into your own project
- [Call the API from an iOS Shortcut](https://apps.apple.com/cn/app/%E5%BF%AB%E6%8D%B7%E6%8C%87%E4%BB%A4/id915249334) to download watermark-free videos/galleries in-app
@@ -121,17 +146,7 @@ Free Douyin/TikTok API: [TikHub Beta API](https://beta.tikhub.io/)
---
## 🤦‍To-do list:
> 💡Suggestions and PRs to this repository are welcome ♪(・ω・)ノ
- [ ] Write an asynchronous desktop downloader for local batch downloads
- [ ] Add hash_tag page scraping to TikHub-API [#101](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/issues/101)
- [ ] Add support for other short-video platforms, e.g. Douyin Huoshan, Kuaishou, Xigua Video, Bilibili
---
## 📦Using the parsing library:
## 📦Using the parsing library (to be updated):
> 💡PyPi: [https://pypi.org/project/douyin-tiktok-scraper/](https://pypi.org/project/douyin-tiktok-scraper/)
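Since this section is marked "to be updated", the snippet below is only a rough sketch of how the PyPI package has been used so far; the `Scraper` class and its async `hybrid_parsing` method are assumptions based on the pre-V4 library and may change with the refactor.

```python
# Rough sketch, assuming the pre-V4 interface of douyin-tiktok-scraper:
# a Scraper class with an async hybrid_parsing(url) method.
import asyncio
from douyin_tiktok_scraper.scraper import Scraper

api = Scraper()

async def main():
    url = "https://v.douyin.com/L4FJNR3/"   # share link reused from the docs examples
    data = await api.hybrid_parsing(url)    # assumed method name from earlier versions
    print(data)

asyncio.run(main())
```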
@@ -225,40 +240,18 @@ https://www.tiktok.com/@evil0ctal/video/7156033831819037994
## 🛰API documentation
> 💡Tip: the endpoint documentation can also be found in the code comments of web_api.py
***API documentation***
Local: [http://localhost:8000/docs](http://localhost:8000/docs)
Local: [http://localhost:80/docs](http://localhost:80/docs)
Online: [https://api.douyin.wtf/docs](https://api.douyin.wtf/docs)
***TikHub-API documentation***
Online: [https://api.tikhub.io/docs](https://api.tikhub.io/docs)
***API demos***
- Fetch video data (mixed TikTok/Douyin parsing)
`https://api.douyin.wtf/api?url=[Video URL]&minimal=false`
`https://api.douyin.wtf/api/hybrid/video_data?url=[Video URL]&minimal=false`
- Download video/gallery (mixed TikTok/Douyin parsing)
`https://api.douyin.wtf/download?url=[Video URL]&prefix=true&watermark=false`
- Download video/gallery by replacing the domain
```
[Douyin]
Original link:
https://www.douyin.com/video/7159502929156705567
Replace the domain:
https://api.douyin.wtf/video/7159502929156705567
# Returns a watermark-free video download response
[TikTok]
original link:
https://www.tiktok.com/@evil0ctal/video/7156033831819037994
Replace Domain:
https://api.douyin.wtf/@evil0ctal/video/7156033831819037994
# Return No Watermark Video Download Response
```
`https://api.douyin.wtf/api/download?url=[Video URL]&prefix=true&with_watermark=false`
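A minimal sketch of calling the demo endpoint above from Python with httpx (the exact fields in the returned JSON depend on the platform and the `minimal` flag):

```python
# Minimal sketch: query the hosted hybrid parsing endpoint shown above.
import asyncio
import httpx

async def main():
    params = {
        "url": "https://v.douyin.com/L4FJNR3/",  # share link from the docs examples
        "minimal": "true",
    }
    async with httpx.AsyncClient() as client:
        resp = await client.get("https://api.douyin.wtf/api/hybrid/video_data", params=params)
        resp.raise_for_status()
        print(resp.json()["data"])

asyncio.run(main())
```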
***See the documentation for more demos...***
@@ -282,53 +275,17 @@ https://api.douyin.wtf/@evil0ctal/video/7156033831819037994
wget -O install.sh https://raw.githubusercontent.com/Evil0ctal/Douyin_TikTok_Download_API/main/bash/install.sh && sudo bash install.sh
```
- After running the Bash script, it automatically uses [config.py](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/config.py) to help you modify [config.ini](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/config.ini)
```console
Please edit config.ini, all input must be numbers!
Default API port: 8000
If you want use different port input new API port here: 80
Use new port for web_api.py: 80
Default API rate limit: 10/minute
If you want use different rate limit input new rate limit here: 60
Use new rate limit: 60/minute
Default App port: 80
If you want use different port input new App port here: 8080
Use new port: 8080
```
- The script then asks which service to start:
api: start `web_api.py` only
web: start `web_app.py` only
all: start both `web_api.py` and `web_app.py`
```console
Run API or Web? [api/web/all/quit] api
Do you want to start the api service when system boot? [y/n] y
Created symlink /etc/systemd/system/multi-user.target.wants/web_api.service → /etc/systemd/system/web_api.service.
API service will start when system boot!
Starting API...
API is running! You can visit http://your_ip:port
You can stop the api service by running: systemctl stop web_api.service
```
> Start/stop the service
- Web service: `systemctl start/stop web_app.service`
- API service: `systemctl start/stop web_api.service`
- `systemctl start/stop Douyin_TikTok_Download_API.service`
> Enable/disable start on boot
- Web service: `systemctl enable/disable web_app.service`
- API service: `systemctl enable/disable web_api.service`
- `systemctl enable/disable Douyin_TikTok_Download_API.service`
> Update the project
- `cd /www/wwwroot/Douyin_TikTok_Download_API/bash`
- `sudo sh update.sh`
- `cd /www/wwwroot/Douyin_TikTok_Download_API/bash && sudo bash update.sh`
## 💽Deployment (Option 2: Docker)
@@ -419,5 +376,3 @@ Web main interface:
> Start: 2021/11/06
> GitHub: [@Evil0ctal](https://github.com/Evil0ctal)
> Contact: Evil0ctal1985@gmail.com

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large

View File

@@ -0,0 +1,120 @@
import os
import zipfile
import aiofiles
import httpx
import yaml
from fastapi import APIRouter, Request # 导入FastAPI组件
from starlette.responses import FileResponse
from app.api.models.APIResponseModel import ErrorResponseModel # 导入响应模型
from crawlers.hybrid.hybrid_crawler import HybridCrawler # 导入混合数据爬虫
router = APIRouter()
HybridCrawler = HybridCrawler()
# 读取上级再上级目录的配置文件/Read the config file from the project root directory
config_path = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(__file__)))), 'config.yaml')
with open(config_path, 'r', encoding='utf-8') as file:
config = yaml.safe_load(file)
async def fetch_data(url: str):
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
}
async with httpx.AsyncClient() as client:
response = await client.get(url, headers=headers)
response.raise_for_status() # 确保响应是成功的
return response
@router.get("/download", summary="在线下载抖音|TikTok视频/图片/Online download Douyin|TikTok video/image")
async def download_file_hybrid(request: Request,
url: str, prefix: bool = True, with_watermark: bool = False):
# 是否开启此端点/Whether to enable this endpoint
if not config["Download_Switch"]:
code = 400
message = "Download endpoint is disabled."
return ErrorResponseModel(code=code, message=message, router=request.url.path, params=dict(request.query_params))
# 开始解析数据/Start parsing data
try:
data = await HybridCrawler.hybrid_parsing_single_video(url, minimal=True)
except Exception as e:
code = 400
return ErrorResponseModel(code=code, message=str(e), router=request.url.path, params=dict(request.query_params))
# 开始下载文件/Start downloading files
try:
data_type = data.get('type')
platform = data.get('platform')
aweme_id = data.get('aweme_id')
file_prefix = config.get("Download_File_Prefix") if prefix else ''
download_path = os.path.join(config.get("Download_Path"), f"{platform}_{data_type}")
# 确保目录存在/Ensure the directory exists
os.makedirs(download_path, exist_ok=True)
# 下载视频文件/Download video file
if data_type == 'video':
file_name = f"{file_prefix}{platform}_{aweme_id}.mp4" if not with_watermark else f"{file_prefix}{platform}_{aweme_id}_watermark.mp4"
url = data.get('video_data').get('nwm_video_url_HQ') if not with_watermark else data.get('video_data').get(
'wm_video_url_HQ')
file_path = os.path.join(download_path, file_name)
# 判断文件是否存在,存在就直接返回/If the file already exists, return it directly
if os.path.exists(file_path):
return FileResponse(path=file_path, media_type='video/mp4', filename=file_name)
# 获取视频文件/Fetch the video file
response = await fetch_data(url)
# 保存文件/Save the file
async with aiofiles.open(file_path, 'wb') as out_file:
await out_file.write(response.content)
# 返回文件内容/Return the file content
return FileResponse(path=file_path, filename=file_name, media_type="video/mp4")
# 下载图片文件/Download image file
elif data_type == 'image':
# 压缩文件属性/Compress file properties
zip_file_name = f"{file_prefix}{platform}_{aweme_id}_images.zip" if not with_watermark else f"{file_prefix}{platform}_{aweme_id}_images_watermark.zip"
zip_file_path = os.path.join(download_path, zip_file_name)
# 判断文件是否存在,存在就直接返回/If the zip file already exists, return it directly
if os.path.exists(zip_file_path):
return FileResponse(path=zip_file_path, filename=zip_file_name, media_type="application/zip")
# 获取图片文件/Get image file
urls = data.get('image_data').get('no_watermark_image_list') if not with_watermark else data.get(
'image_data').get('watermark_image_list')
image_file_list = []
# 遍历图片链接/Iterate over the image URLs (enumerate keeps indices correct even for duplicate URLs)
for index, url in enumerate(urls):
# 请求图片文件/Request image file
response = await fetch_data(url)
content_type = response.headers.get('content-type')
file_format = content_type.split('/')[1]
file_name = f"{file_prefix}{platform}_{aweme_id}_{index + 1}.{file_format}" if not with_watermark else f"{file_prefix}{platform}_{aweme_id}_{index + 1}_watermark.{file_format}"
file_path = os.path.join(download_path, file_name)
image_file_list.append(file_path)
# 保存文件/Save file
async with aiofiles.open(file_path, 'wb') as out_file:
await out_file.write(response.content)
# 压缩文件/Compress file
with zipfile.ZipFile(zip_file_path, 'w') as zip_file:
for image_file in image_file_list:
zip_file.write(image_file, os.path.basename(image_file))
# 返回压缩文件/Return compressed file
return FileResponse(path=zip_file_path, filename=zip_file_name, media_type="application/zip")
# 异常处理/Exception handling
except Exception as e:
code = 400
return ErrorResponseModel(code=code, message=str(e), router=request.url.path, params=dict(request.query_params))
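A quick way to exercise this endpoint once the app is running is sketched below. It assumes a local deployment on port 80 (the default in `app/main.py`), the `/api` prefix added by `app/api/router.py`, and a share link that points to a video (image posts come back as a zip instead).

```python
# Sketch: stream a watermark-free video through the new /api/download endpoint.
# Assumes the app is running locally on port 80 and the link points to a video.
import httpx

params = {
    "url": "https://v.douyin.com/L4FJNR3/",
    "prefix": "true",
    "with_watermark": "false",
}
with httpx.stream("GET", "http://localhost:80/api/download", params=params, timeout=60) as resp:
    resp.raise_for_status()
    with open("video.mp4", "wb") as f:          # image posts would need a .zip name instead
        for chunk in resp.iter_bytes():
            f.write(chunk)
```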

View File

@@ -0,0 +1,53 @@
import asyncio
from fastapi import APIRouter, Body, Query, Request, HTTPException # 导入FastAPI组件
from app.api.models.APIResponseModel import ResponseModel, ErrorResponseModel # 导入响应模型
# 爬虫/Crawler
from crawlers.hybrid.hybrid_crawler import HybridCrawler # 导入混合爬虫
HybridCrawler = HybridCrawler() # 实例化混合爬虫
router = APIRouter()
@router.get("/video_data", response_model=ResponseModel, tags=["Hybrid-API"],
summary="混合解析单一视频接口/Hybrid parsing single video endpoint")
async def hybrid_parsing_single_video(request: Request,
url: str = Query(example="https://v.douyin.com/L4FJNR3/"),
minimal: bool = Query(default=False)):
"""
# [中文]
### 用途:
- 该接口用于解析抖音/TikTok单一视频的数据
### 参数:
- `url`: 视频链接分享链接分享文本
### 返回:
- `data`: 视频数据
# [English]
### Purpose:
- This endpoint is used to parse data of a single Douyin/TikTok video.
### Parameters:
- `url`: Video link, share link, or share text.
### Returns:
- `data`: Video data.
# [Example]
url = "https://v.douyin.com/L4FJNR3/"
"""
try:
# 解析视频/Parse video
data = await HybridCrawler.hybrid_parsing_single_video(url=url, minimal=minimal)
# 返回数据/Return data
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
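The endpoint above is a thin wrapper around `HybridCrawler.hybrid_parsing_single_video`, so the same data can be fetched directly from Python without the HTTP layer; a minimal sketch (the printed keys are the ones the download endpoint relies on when `minimal=True`):

```python
# Sketch: call the hybrid crawler directly, bypassing FastAPI.
import asyncio
from crawlers.hybrid.hybrid_crawler import HybridCrawler

async def main():
    crawler = HybridCrawler()
    data = await crawler.hybrid_parsing_single_video(
        url="https://v.douyin.com/L4FJNR3/",  # share link from the docstring example
        minimal=True,                          # trimmed payload, as used by /download
    )
    print(data.get("platform"), data.get("type"), data.get("aweme_id"))

asyncio.run(main())
```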

View File

@@ -0,0 +1,24 @@
import os
import yaml
from fastapi import APIRouter
from app.api.models.APIResponseModel import iOS_Shortcut
# 读取上级再上级目录的配置文件/Read the config file from the project root directory
config_path = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(__file__)))), 'config.yaml')
with open(config_path, 'r', encoding='utf-8') as file:
config = yaml.safe_load(file)
router = APIRouter()
@router.get("/shortcut", response_model=iOS_Shortcut, summary="用于iOS快捷指令的版本更新信息/Version update information for iOS shortcuts")
async def get_shortcut():
shortcut_config = config["iOS_Shortcut"]
version = shortcut_config["iOS_Shortcut_Version"]
update = shortcut_config['iOS_Shortcut_Update_Time']
link = shortcut_config['iOS_Shortcut_Link']
link_en = shortcut_config['iOS_Shortcut_Link_EN']
note = shortcut_config['iOS_Shortcut_Update_Note']
note_en = shortcut_config['iOS_Shortcut_Update_Note_EN']
return iOS_Shortcut(version=str(version), update=update, link=link, link_en=link_en, note=note, note_en=note_en)
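For reference, this is the shape of the `iOS_Shortcut` section the endpoint expects in `config.yaml`, written here as the Python dict `yaml.safe_load` would produce; the key names come from the code above, while the values are placeholders.

```python
# Sketch of the expected config structure; values are placeholders, not real links.
shortcut_config = {
    "iOS_Shortcut": {
        "iOS_Shortcut_Version": "1.0.0",
        "iOS_Shortcut_Update_Time": "2024-04-20",
        "iOS_Shortcut_Link": "https://example.com/ios-shortcut",
        "iOS_Shortcut_Link_EN": "https://example.com/ios-shortcut-en",
        "iOS_Shortcut_Update_Note": "更新说明",
        "iOS_Shortcut_Update_Note_EN": "Update note",
    }
}
```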

View File

@@ -0,0 +1,45 @@
from fastapi import APIRouter, Query, Request, HTTPException # 导入FastAPI组件
from app.api.models.APIResponseModel import ResponseModel, ErrorResponseModel # 导入响应模型
from crawlers.tiktok.app.app_crawler import TikTokAPPCrawler # 导入APP爬虫
router = APIRouter()
TikTokAPPCrawler = TikTokAPPCrawler()
# 获取单个作品数据
@router.get("/fetch_one_video", response_model=ResponseModel, summary="获取单个作品数据/Get single video data")
async def fetch_one_video(request: Request,
aweme_id: str = Query(example="7350810998023949599", description="作品id/Video id")):
"""
# [中文]
### 用途:
- 获取单个作品数据
### 参数:
- aweme_id: 作品id
### 返回:
- 作品数据
# [English]
### Purpose:
- Get single video data
### Parameters:
- aweme_id: Video id
### Return:
- Video data
# [示例/Example]
aweme_id = "7350810998023949599"
"""
try:
data = await TikTokAPPCrawler.fetch_one_video(aweme_id)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())

View File

@@ -0,0 +1,951 @@
from typing import List
from fastapi import APIRouter, Query, Body, Request, HTTPException # 导入FastAPI组件
from app.api.models.APIResponseModel import ResponseModel, ErrorResponseModel # 导入响应模型
from crawlers.tiktok.web.web_crawler import TikTokWebCrawler # 导入TikTokWebCrawler类
router = APIRouter()
TikTokWebCrawler = TikTokWebCrawler()
# 获取单个作品数据
@router.get("/fetch_one_video",
response_model=ResponseModel,
summary="获取单个作品数据/Get single video data")
async def fetch_one_video(request: Request,
itemId: str = Query(example="7339393672959757570", description="作品id/Video id")):
"""
# [中文]
### 用途:
- 获取单个作品数据
### 参数:
- itemId: 作品id
### 返回:
- 作品数据
# [English]
### Purpose:
- Get single video data
### Parameters:
- itemId: Video id
### Return:
- Video data
# [示例/Example]
itemId = "7339393672959757570"
"""
try:
data = await TikTokWebCrawler.fetch_one_video(itemId)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 获取用户的个人信息
@router.get("/fetch_user_profile",
response_model=ResponseModel,
summary="获取用户的个人信息/Get user profile")
async def fetch_user_profile(request: Request,
uniqueId: str = Query(example="tiktok", description="用户uniqueId/User uniqueId"),
secUid: str = Query(default="", description="用户secUid/User secUid"),):
"""
# [中文]
### 用途:
- 获取用户的个人信息
### 参数:
- secUid: 用户secUid
- uniqueId: 用户uniqueId
- secUid和uniqueId至少提供一个, 优先使用uniqueId, 也就是用户主页的链接中的用户名
### 返回:
- 用户的个人信息
# [English]
### Purpose:
- Get user profile
### Parameters:
- secUid: User secUid
- uniqueId: User uniqueId
- At least one of secUid and uniqueId is provided, and uniqueId is preferred, that is, the username in the user's homepage link.
### Return:
- User profile
# [示例/Example]
secUid = "MS4wLjABAAAAv7iSuuXDJGDvJkmH_vz1qkDZYo1apxgzaxdBSeIuPiM"
uniqueId = "tiktok"
"""
try:
data = await TikTokWebCrawler.fetch_user_profile(secUid, uniqueId)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 获取用户的作品列表
@router.get("/fetch_user_post",
response_model=ResponseModel,
summary="获取用户的作品列表/Get user posts")
async def fetch_user_post(request: Request,
secUid: str = Query(example="MS4wLjABAAAAv7iSuuXDJGDvJkmH_vz1qkDZYo1apxgzaxdBSeIuPiM",
description="用户secUid/User secUid"),
cursor: int = Query(default=0, description="翻页游标/Page cursor"),
count: int = Query(default=35, description="每页数量/Number per page"),
coverFormat: int = Query(default=2, description="封面格式/Cover format")):
"""
# [中文]
### 用途:
- 获取用户的作品列表
### 参数:
- secUid: 用户secUid
- cursor: 翻页游标
- count: 每页数量
- coverFormat: 封面格式
### 返回:
- 用户的作品列表
# [English]
### Purpose:
- Get user posts
### Parameters:
- secUid: User secUid
- cursor: Page cursor
- count: Number per page
- coverFormat: Cover format
### Return:
- User posts
# [示例/Example]
secUid = "MS4wLjABAAAAv7iSuuXDJGDvJkmH_vz1qkDZYo1apxgzaxdBSeIuPiM"
cursor = 0
count = 35
coverFormat = 2
"""
try:
data = await TikTokWebCrawler.fetch_user_post(secUid, cursor, count, coverFormat)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 获取用户的点赞列表
@router.get("/fetch_user_like",
response_model=ResponseModel,
summary="获取用户的点赞列表/Get user likes")
async def fetch_user_like(request: Request,
secUid: str = Query(
example="MS4wLjABAAAAq1iRXNduFZpY301UkVpJ1eQT60_NiWS9QQSeNqmNQEDJp0pOF8cpleNEdiJx5_IU",
description="用户secUid/User secUid"),
cursor: int = Query(default=0, description="翻页游标/Page cursor"),
count: int = Query(default=35, description="每页数量/Number per page"),
coverFormat: int = Query(default=2, description="封面格式/Cover format")):
"""
# [中文]
### 用途:
- 获取用户的点赞列表
- 注意: 该接口需要用户点赞列表为公开状态
### 参数:
- secUid: 用户secUid
- cursor: 翻页游标
- count: 每页数量
- coverFormat: 封面格式
### 返回:
- 用户的点赞列表
# [English]
### Purpose:
- Get user likes
- Note: This interface requires that the user's like list be public
### Parameters:
- secUid: User secUid
- cursor: Page cursor
- count: Number per page
- coverFormat: Cover format
### Return:
- User likes
# [示例/Example]
secUid = "MS4wLjABAAAAq1iRXNduFZpY301UkVpJ1eQT60_NiWS9QQSeNqmNQEDJp0pOF8cpleNEdiJx5_IU"
cursor = 0
count = 35
coverFormat = 2
"""
try:
data = await TikTokWebCrawler.fetch_user_like(secUid, cursor, count, coverFormat)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 获取用户的收藏列表
@router.get("/fetch_user_collect",
response_model=ResponseModel,
summary="获取用户的收藏列表/Get user favorites")
async def fetch_user_collect(request: Request,
cookie: str = Query(example="Your_Cookie", description="用户cookie/User cookie"),
secUid: str = Query(example="Your_SecUid", description="用户secUid/User secUid"),
cursor: int = Query(default=0, description="翻页游标/Page cursor"),
count: int = Query(default=30, description="每页数量/Number per page"),
coverFormat: int = Query(default=2, description="封面格式/Cover format")):
"""
# [中文]
### 用途:
- 获取用户的收藏列表
- 注意: 该接口目前只能获取自己的收藏列表需要提供自己账号的cookie
### 参数:
- cookie: 用户cookie
- secUid: 用户secUid
- cursor: 翻页游标
- count: 每页数量
- coverFormat: 封面格式
### 返回:
- 用户的收藏列表
# [English]
### Purpose:
- Get user favorites
- Note: This interface can currently only get your own favorites list, you need to provide your account cookie.
### Parameters:
- cookie: User cookie
- secUid: User secUid
- cursor: Page cursor
- count: Number per page
- coverFormat: Cover format
### Return:
- User favorites
# [示例/Example]
cookie = "Your_Cookie"
secUid = "Your_SecUid"
cursor = 0
count = 30
coverFormat = 2
"""
try:
data = await TikTokWebCrawler.fetch_user_collect(cookie, secUid, cursor, count, coverFormat)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 获取用户的播放列表
@router.get("/fetch_user_play_list",
response_model=ResponseModel,
summary="获取用户的播放列表/Get user play list")
async def fetch_user_play_list(request: Request,
secUid: str = Query(example="MS4wLjABAAAAv7iSuuXDJGDvJkmH_vz1qkDZYo1apxgzaxdBSeIuPiM",
description="用户secUid/User secUid"),
cursor: int = Query(default=0, description="翻页游标/Page cursor"),
count: int = Query(default=30, description="每页数量/Number per page")):
"""
# [中文]
### 用途:
- 获取用户的播放列表
### 参数:
- secUid: 用户secUid
- cursor: 翻页游标
- count: 每页数量
### 返回:
- 用户的播放列表
# [English]
### Purpose:
- Get user play list
### Parameters:
- secUid: User secUid
- cursor: Page cursor
- count: Number per page
### Return:
- User play list
# [示例/Example]
secUid = "MS4wLjABAAAAv7iSuuXDJGDvJkmH_vz1qkDZYo1apxgzaxdBSeIuPiM"
cursor = 0
count = 30
"""
try:
data = await TikTokWebCrawler.fetch_user_play_list(secUid, cursor, count)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 获取用户的合辑列表
@router.get("/fetch_user_mix",
response_model=ResponseModel,
summary="获取用户的合辑列表/Get user mix list")
async def fetch_user_mix(request: Request,
mixId: str = Query(example="7101538765474106158",
description="合辑id/Mix id"),
cursor: int = Query(default=0, description="翻页游标/Page cursor"),
count: int = Query(default=30, description="每页数量/Number per page")):
"""
# [中文]
### 用途:
- 获取用户的合辑列表
### 参数:
- mixId: 合辑id
- cursor: 翻页游标
- count: 每页数量
### 返回:
- 用户的合辑列表
# [English]
### Purpose:
- Get user mix list
### Parameters:
- mixId: Mix id
- cursor: Page cursor
- count: Number per page
### Return:
- User mix list
# [示例/Example]
mixId = "7101538765474106158"
cursor = 0
count = 30
"""
try:
data = await TikTokWebCrawler.fetch_user_mix(mixId, cursor, count)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 获取作品的评论列表
@router.get("/fetch_post_comment",
response_model=ResponseModel,
summary="获取作品的评论列表/Get video comments")
async def fetch_post_comment(request: Request,
aweme_id: str = Query(example="7304809083817774382", description="作品id/Video id"),
cursor: int = Query(default=0, description="翻页游标/Page cursor"),
count: int = Query(default=20, description="每页数量/Number per page"),
current_region: str = Query(default="", description="当前地区/Current region")):
"""
# [中文]
### 用途:
- 获取作品的评论列表
### 参数:
- aweme_id: 作品id
- cursor: 翻页游标
- count: 每页数量
- current_region: 当前地区默认为空
### 返回:
- 作品的评论列表
# [English]
### Purpose:
- Get video comments
### Parameters:
- aweme_id: Video id
- cursor: Page cursor
- count: Number per page
- current_region: Current region, default is empty.
### Return:
- Video comments
# [示例/Example]
aweme_id = "7304809083817774382"
cursor = 0
count = 20
current_region = ""
"""
try:
data = await TikTokWebCrawler.fetch_post_comment(aweme_id, cursor, count, current_region)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 获取作品的评论回复列表
@router.get("/fetch_post_comment_reply",
response_model=ResponseModel,
summary="获取作品的评论回复列表/Get video comment replies")
async def fetch_post_comment_reply(request: Request,
item_id: str = Query(example="7304809083817774382", description="作品id/Video id"),
comment_id: str = Query(example="7304877760886588191",
description="评论id/Comment id"),
cursor: int = Query(default=0, description="翻页游标/Page cursor"),
count: int = Query(default=20, description="每页数量/Number per page"),
current_region: str = Query(default="", description="当前地区/Current region")):
"""
# [中文]
### 用途:
- 获取作品的评论回复列表
### 参数:
- item_id: 作品id
- comment_id: 评论id
- cursor: 翻页游标
- count: 每页数量
- current_region: 当前地区默认为空
### 返回:
- 作品的评论回复列表
# [English]
### Purpose:
- Get video comment replies
### Parameters:
- item_id: Video id
- comment_id: Comment id
- cursor: Page cursor
- count: Number per page
- current_region: Current region, default is empty.
### Return:
- Video comment replies
# [示例/Example]
item_id = "7304809083817774382"
comment_id = "7304877760886588191"
cursor = 0
count = 20
current_region = ""
"""
try:
data = await TikTokWebCrawler.fetch_post_comment_reply(item_id, comment_id, cursor, count, current_region)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 获取用户的粉丝列表
@router.get("/fetch_user_fans",
response_model=ResponseModel,
summary="获取用户的粉丝列表/Get user followers")
async def fetch_user_fans(request: Request,
secUid: str = Query(example="MS4wLjABAAAAv7iSuuXDJGDvJkmH_vz1qkDZYo1apxgzaxdBSeIuPiM",
description="用户secUid/User secUid"),
count: int = Query(default=30, description="每页数量/Number per page"),
maxCursor: int = Query(default=0, description="最大游标/Max cursor"),
minCursor: int = Query(default=0, description="最小游标/Min cursor")):
"""
# [中文]
### 用途:
- 获取用户的粉丝列表
### 参数:
- secUid: 用户secUid
- count: 每页数量
- maxCursor: 最大游标
- minCursor: 最小游标
### 返回:
- 用户的粉丝列表
# [English]
### Purpose:
- Get user followers
### Parameters:
- secUid: User secUid
- count: Number per page
- maxCursor: Max cursor
- minCursor: Min cursor
### Return:
- User followers
# [示例/Example]
secUid = "MS4wLjABAAAAv7iSuuXDJGDvJkmH_vz1qkDZYo1apxgzaxdBSeIuPiM"
count = 30
maxCursor = 0
minCursor = 0
"""
try:
data = await TikTokWebCrawler.fetch_user_fans(secUid, count, maxCursor, minCursor)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 获取用户的关注列表
@router.get("/fetch_user_follow",
response_model=ResponseModel,
summary="获取用户的关注列表/Get user followings")
async def fetch_user_follow(request: Request,
secUid: str = Query(example="MS4wLjABAAAAv7iSuuXDJGDvJkmH_vz1qkDZYo1apxgzaxdBSeIuPiM",
description="用户secUid/User secUid"),
count: int = Query(default=30, description="每页数量/Number per page"),
maxCursor: int = Query(default=0, description="最大游标/Max cursor"),
minCursor: int = Query(default=0, description="最小游标/Min cursor")):
"""
# [中文]
### 用途:
- 获取用户的关注列表
### 参数:
- secUid: 用户secUid
- count: 每页数量
- maxCursor: 最大游标
- minCursor: 最小游标
### 返回:
- 用户的关注列表
# [English]
### Purpose:
- Get user followings
### Parameters:
- secUid: User secUid
- count: Number per page
- maxCursor: Max cursor
- minCursor: Min cursor
### Return:
- User followings
# [示例/Example]
secUid = "MS4wLjABAAAAv7iSuuXDJGDvJkmH_vz1qkDZYo1apxgzaxdBSeIuPiM"
count = 30
maxCursor = 0
minCursor = 0
"""
try:
data = await TikTokWebCrawler.fetch_user_follow(secUid, count, maxCursor, minCursor)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
"""-------------------------------------------------------utils接口列表-------------------------------------------------------"""
# 生成真实msToken
@router.get("/generate_real_msToken",
response_model=ResponseModel,
summary="生成真实msToken/Generate real msToken")
async def generate_real_msToken(request: Request):
"""
# [中文]
### 用途:
- 生成真实msToken
### 返回:
- 真实msToken
# [English]
### Purpose:
- Generate real msToken
### Return:
- Real msToken
"""
try:
data = await TikTokWebCrawler.fetch_real_msToken()
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 生成ttwid
@router.get("/generate_ttwid",
response_model=ResponseModel,
summary="生成ttwid/Generate ttwid")
async def generate_ttwid(request: Request,
cookie: str = Query(example="Your_Cookie", description="用户cookie/User cookie")):
"""
# [中文]
### 用途:
- 生成ttwid
### 参数:
- cookie: 用户cookie
### 返回:
- ttwid
# [English]
### Purpose:
- Generate ttwid
### Parameters:
- cookie: User cookie
### Return:
- ttwid
# [示例/Example]
cookie = "Your_Cookie"
"""
try:
data = await TikTokWebCrawler.fetch_ttwid(cookie)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 生成xbogus
@router.get("/generate_xbogus",
response_model=ResponseModel,
summary="生成xbogus/Generate xbogus")
async def generate_xbogus(request: Request,
url: str = Query(
example="https://www.tiktok.com/api/item/detail/?WebIdLastTime=1712665533&aid=1988&app_language=en&app_name=tiktok_web&browser_language=en-US&browser_name=Mozilla&browser_online=true&browser_platform=Win32&browser_version=5.0%20%28Windows%29&channel=tiktok_web&cookie_enabled=true&device_id=7349090360347690538&device_platform=web_pc&focus_state=true&from_page=user&history_len=4&is_fullscreen=false&is_page_visible=true&language=en&os=windows&priority_region=US&referer=&region=US&root_referer=https%3A%2F%2Fwww.tiktok.com%2F&screen_height=1080&screen_width=1920&webcast_language=en&tz_name=America%2FTijuana&msToken=AYFCEapCLbMrS8uTLBoYdUMeeVLbCdFQ_QF_-OcjzJw1CPr4JQhWUtagy0k4a9IITAqi5Qxr2Vdh9mgCbyGxTnvWLa4ZVY6IiSf6lcST-tr0IXfl-r_ZTpzvWDoQfqOVsWCTlSNkhAwB-tap5g==&itemId=7339393672959757570",
description="未签名的API URL/Unsigned API URL"),
user_agent: str = Query(
example="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3",
description="用户浏览器User-Agent/User browser User-Agent")):
"""
# [中文]
### 用途:
- 生成xbogus
### 参数:
- url: 未签名的API URL
- user_agent: 用户浏览器User-Agent
### 返回:
- xbogus
# [English]
### Purpose:
- Generate xbogus
### Parameters:
- url: Unsigned API URL
- user_agent: User browser User-Agent
### Return:
- xbogus
# [示例/Example]
url = "https://www.tiktok.com/api/item/detail/?WebIdLastTime=1712665533&aid=1988&app_language=en&app_name=tiktok_web&browser_language=en-US&browser_name=Mozilla&browser_online=true&browser_platform=Win32&browser_version=5.0%20%28Windows%29&channel=tiktok_web&cookie_enabled=true&device_id=7349090360347690538&device_platform=web_pc&focus_state=true&from_page=user&history_len=4&is_fullscreen=false&is_page_visible=true&language=en&os=windows&priority_region=US&referer=&region=US&root_referer=https%3A%2F%2Fwww.tiktok.com%2F&screen_height=1080&screen_width=1920&webcast_language=en&tz_name=America%2FTijuana&msToken=AYFCEapCLbMrS8uTLBoYdUMeeVLbCdFQ_QF_-OcjzJw1CPr4JQhWUtagy0k4a9IITAqi5Qxr2Vdh9mgCbyGxTnvWLa4ZVY6IiSf6lcST-tr0IXfl-r_ZTpzvWDoQfqOVsWCTlSNkhAwB-tap5g==&itemId=7339393672959757570"
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
"""
try:
data = await TikTokWebCrawler.gen_xbogus(url, user_agent)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 提取列表用户id
@router.get("/get_sec_user_id",
response_model=ResponseModel,
summary="提取列表用户id/Extract list user id")
async def get_sec_user_id(request: Request,
url: str = Query(
example="https://www.tiktok.com/@tiktok",
description="用户主页链接/User homepage link")):
"""
# [中文]
### 用途:
- 提取列表用户id
### 参数:
- url: 用户主页链接
### 返回:
- 用户id
# [English]
### Purpose:
- Extract list user id
### Parameters:
- url: User homepage link
### Return:
- User id
# [示例/Example]
url = "https://www.tiktok.com/@tiktok"
"""
try:
data = await TikTokWebCrawler.get_sec_user_id(url)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 提取列表用户id
@router.post("/get_all_sec_user_id",
response_model=ResponseModel,
summary="提取列表用户id/Extract list user id")
async def get_all_sec_user_id(request: Request,
url: List[str] = Body(
example=["https://www.tiktok.com/@tiktok"],
description="用户主页链接/User homepage link")):
"""
# [中文]
### 用途:
- 提取列表用户id
### 参数:
- url: 用户主页链接
### 返回:
- 用户id
# [English]
### Purpose:
- Extract list user id
### Parameters:
- url: User homepage link
### Return:
- User id
# [示例/Example]
url = ["https://www.tiktok.com/@tiktok"]
"""
try:
data = await TikTokWebCrawler.get_all_sec_user_id(url)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 提取单个作品id
@router.get("/get_aweme_id",
response_model=ResponseModel,
summary="提取单个作品id/Extract single video id")
async def get_aweme_id(request: Request,
url: str = Query(
example="https://www.tiktok.com/@owlcitymusic/video/7218694761253735723",
description="作品链接/Video link")):
"""
# [中文]
### 用途:
- 提取单个作品id
### 参数:
- url: 作品链接
### 返回:
- 作品id
# [English]
### Purpose:
- Extract single video id
### Parameters:
- url: Video link
### Return:
- Video id
# [示例/Example]
url = "https://www.tiktok.com/@owlcitymusic/video/7218694761253735723"
"""
try:
data = await TikTokWebCrawler.get_aweme_id(url)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 提取列表作品id
@router.post("/get_all_aweme_id",
response_model=ResponseModel,
summary="提取列表作品id/Extract list video id")
async def get_all_aweme_id(request: Request,
url: List[str] = Body(
example=["https://www.tiktok.com/@owlcitymusic/video/7218694761253735723"],
description="作品链接/Video link")):
"""
# [中文]
### 用途:
- 提取列表作品id
### 参数:
- url: 作品链接
### 返回:
- 作品id
# [English]
### Purpose:
- Extract list video id
### Parameters:
- url: Video link
### Return:
- Video id
# [示例/Example]
url = ["https://www.tiktok.com/@owlcitymusic/video/7218694761253735723"]
"""
try:
data = await TikTokWebCrawler.get_all_aweme_id(url)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 获取用户unique_id
@router.get("/get_unique_id",
response_model=ResponseModel,
summary="获取用户unique_id/Get user unique_id")
async def get_unique_id(request: Request,
url: str = Query(
example="https://www.tiktok.com/@tiktok",
description="用户主页链接/User homepage link")):
"""
# [中文]
### 用途:
- 获取用户unique_id
### 参数:
- url: 用户主页链接
### 返回:
- unique_id
# [English]
### Purpose:
- Get user unique_id
### Parameters:
- url: User homepage link
### Return:
- unique_id
# [示例/Example]
url = "https://www.tiktok.com/@tiktok"
"""
try:
data = await TikTokWebCrawler.get_unique_id(url)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
# 获取列表unique_id列表
@router.post("/get_all_unique_id",
response_model=ResponseModel,
summary="获取列表unique_id/Get list unique_id")
async def get_all_unique_id(request: Request,
url: List[str] = Body(
example=["https://www.tiktok.com/@tiktok"],
description="用户主页链接/User homepage link")):
"""
# [中文]
### 用途:
- 获取列表unique_id
### 参数:
- url: 用户主页链接
### 返回:
- unique_id
# [English]
### Purpose:
- Get list unique_id
### Parameters:
- url: User homepage link
### Return:
- unique_id
# [示例/Example]
url = ["https://www.tiktok.com/@tiktok"]
"""
try:
data = await TikTokWebCrawler.get_all_unique_id(url)
return ResponseModel(code=200,
router=request.url.path,
data=data)
except Exception as e:
status_code = 400
detail = ErrorResponseModel(code=status_code,
router=request.url.path,
params=dict(request.query_params),
)
raise HTTPException(status_code=status_code, detail=detail.dict())
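The `get_all_*` routes above are the only POST endpoints in this file and take a raw JSON array as the request body; a minimal sketch against a local deployment (port 80 and the `/api` prefix from `app/main.py` and `app/api/router.py`):

```python
# Sketch: batch-extract aweme_ids via the POST /api/tiktok/web/get_all_aweme_id route.
import httpx

urls = ["https://www.tiktok.com/@owlcitymusic/video/7218694761253735723"]
resp = httpx.post(
    "http://localhost:80/api/tiktok/web/get_all_aweme_id",
    json=urls,      # the body is a plain JSON array of links
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["data"])
```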

View File

@@ -0,0 +1,41 @@
from fastapi import Body, FastAPI, Query, Request, HTTPException
from pydantic import BaseModel, Field
from typing import Any, Callable, Type, Optional, Dict
from functools import wraps
import datetime
app = FastAPI()
# 定义响应模型/Define the response model
class ResponseModel(BaseModel):
code: int = 200
router: str = "Endpoint path"
data: Optional[Any] = {}
# 定义错误响应模型/Define the error response model
class ErrorResponseModel(BaseModel):
code: int = 400
message: str = "An error occurred."
support: str = "Please contact us on Github: https://github.com/Evil0ctal/Douyin_TikTok_Download_API"
time: str = Field(default_factory=lambda: datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))  # 每次响应生成时间戳/Generate a fresh timestamp per response
router: str
params: dict = {}
# 混合解析响应模型/Hybrid parsing response model
class HybridResponseModel(BaseModel):
code: int = 200
router: str = "Hybrid parsing single video endpoint"
data: Optional[Any] = {}
# iOS_Shortcut响应模型/iOS_Shortcut response model
class iOS_Shortcut(BaseModel):
version: str
update: str
link: str
link_en: str
note: str
note_en: str
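A small sketch of what the two wrappers above serialize to; the field values are illustrative only.

```python
# Sketch: constructing and serializing the response wrappers defined above.
from app.api.models.APIResponseModel import ResponseModel, ErrorResponseModel

ok = ResponseModel(code=200, router="/api/hybrid/video_data", data={"aweme_id": "123"})
err = ErrorResponseModel(code=400, message="Invalid URL", router="/api/hybrid/video_data",
                         params={"url": "not-a-link"})

print(ok.dict())   # {'code': 200, 'router': '/api/hybrid/video_data', 'data': {...}}
print(err.dict())  # also carries the default `support` link and a `time` stamp
```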

25
app/api/router.py Normal file
View File

@@ -0,0 +1,25 @@
from fastapi import APIRouter
from app.api.endpoints import (
tiktok_web,
tiktok_app,
douyin_web,
hybrid_parsing, ios_shortcut, download,
)
router = APIRouter()
# TikTok routers
router.include_router(tiktok_web.router, prefix="/tiktok/web", tags=["TikTok-Web-API"])
router.include_router(tiktok_app.router, prefix="/tiktok/app", tags=["TikTok-App-API"])
# Douyin routers
router.include_router(douyin_web.router, prefix="/douyin/web", tags=["Douyin-Web-API"])
# Hybrid routers
router.include_router(hybrid_parsing.router, prefix="/hybrid", tags=["Hybrid-API"])
# iOS_Shortcut routers
router.include_router(ios_shortcut.router, prefix="/ios", tags=["iOS-Shortcut"])
# Download routers
router.include_router(download.router, tags=["Download"])
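Because `app/main.py` mounts this router under `/api`, the prefixes above compose into the final request paths; the sketch below simply prints them.

```python
# Sketch: list the final paths produced by the nested router prefixes.
from fastapi import FastAPI
from app.api.router import router as api_router

app = FastAPI()
app.include_router(api_router, prefix="/api")   # same prefix as app/main.py

for route in app.routes:
    path = getattr(route, "path", None)
    if path and path.startswith("/api"):
        print(path)   # e.g. /api/hybrid/video_data, /api/tiktok/web/fetch_one_video, /api/download
```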

122
app/main.py Normal file
View File

@@ -0,0 +1,122 @@
# ==============================================================================
# Copyright (C) 2021 Evil0ctal
#
# This file is part of the Douyin_TikTok_Download_API project.
#
# This project is licensed under the Apache License 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
#         __
#        />  フ
#       |  _  _ l
#       ` ミ_x
#      /      | Feed me Stars ⭐
#     /  ヽ   ノ
#     │  | | |
#  / ̄|   | | |
#  | ( ̄ヽ__ヽ_)__)
#  \二つ
# ==============================================================================
#
# Contributor Link:
# - https://github.com/Evil0ctal
# - https://github.com/Johnserf-Seed
#
# ==============================================================================
# FastAPI APP
import uvicorn
from fastapi import FastAPI
from app.api.router import router as api_router
# PyWebIO APP
from app.web.app import MainView
from pywebio.platform.fastapi import asgi_app
# API Tags
tags_metadata = [
{
"name": "Hybrid-API",
"description": "**(混合数据接口/Hybrid-API data endpoints)**",
},
{
"name": "Douyin-Web-API",
"description": "**(抖音Web数据接口/Douyin-Web-API data endpoints)**",
},
{
"name": "TikTok-Web-API",
"description": "**(TikTok-Web-API数据接口/TikTok-Web-API data endpoints)**",
},
{
"name": "TikTok-App-API",
"description": "**(TikTok-App-API数据接口/TikTok-App-API data endpoints)**",
},
{
"name": "iOS-Shortcut",
"description": "**(iOS快捷指令数据接口/iOS-Shortcut data endpoints)**",
},
{
"name": "Download",
"description": "**(下载数据接口/Download data endpoints)**",
},
]
version = 'V4.0.0'
update_time = '2024-04-20'
environment = 'development'
description = f"""
### [中文]
#### 关于
- **Github**: [Douyin_TikTok_Download_API](https://github.com/Evil0ctal/Douyin_TikTok_Download_API)
- **版本**: `{version}`
- **更新时间**: `{update_time}`
- **环境**: `{environment}`
- **文档**: [API Documentation](https://api.douyin.wtf)
#### 备注
- 本项目仅供学习交流使用不得用于违法用途否则后果自负
- 如果你不想自己部署可以直接使用我们的在线API服务[Douyin_TikTok_Download_API](https://api.douyin.wtf)
- 如果你需要更稳定以及更多功能的API服务可以使用付费API服务[TikHub API](https://beta.tikhub.io/)
### [English]
#### About
- **Github**: [Douyin_TikTok_Download_API](https://github.com/Evil0ctal/Douyin_TikTok_Download_API)
- **Version**: `{version}`
- **Last Updated**: `{update_time}`
- **Environment**: `{environment}`
- **Documentation**: [API Documentation](https://api.douyin.wtf)
#### Note
- This project is for learning and communication only, and shall not be used for illegal purposes, otherwise the consequences shall be borne by yourself.
- If you do not want to deploy it yourself, you can directly use our online API service: [Douyin_TikTok_Download_API](https://api.douyin.wtf)
- If you need a more stable and feature-rich API service, you can use the paid API service: [TikHub API](https://beta.tikhub.io)
"""
app = FastAPI(
title="Douyin TikTok Download API",
description=description,
version=version,
openapi_tags=tags_metadata,
docs_url='/docs', # 文档路径
redoc_url='/redoc', # redoc文档路径
)
# API router
app.include_router(api_router, prefix="/api")
# PyWebIO APP
webapp = asgi_app(lambda: MainView().main_view())
app.mount("/", webapp)
if __name__ == '__main__':
uvicorn.run(app, host="0.0.0.0", port=80)
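Example (illustrative only, not part of this commit): querying the hybrid endpoint on a locally running instance started with `python app/main.py`. The /api prefix comes from app.include_router(api_router, prefix="/api") above; the url and minimal query parameters mirror the links built in app/web/views/ParseVideo.py below, and the share link here is just a placeholder.

import httpx

# Hypothetical share link; replace it with a real Douyin/TikTok share URL.
resp = httpx.get(
    "http://127.0.0.1:80/api/hybrid/video_data",
    params={"url": "https://v.douyin.com/xxxxx/", "minimal": True},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())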

92
app/web/app.py Normal file
View File

@ -0,0 +1,92 @@
# PyWebIO组件/PyWebIO components
import os
import yaml
from pywebio import session, config as pywebio_config
from pywebio.input import *
from pywebio.output import *
from app.web.views.About import about_pop_window
from app.web.views.Document import api_document_pop_window
from app.web.views.Downloader import downloader_pop_window
from app.web.views.EasterEgg import a
from app.web.views.ParseVideo import parse_video
from app.web.views.Shortcuts import ios_pop_window
# PyWebIO的各个视图/Views of PyWebIO
from app.web.views.ViewsUtils import ViewsUtils
# 读取上级再上级目录的配置文件
config_path = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(__file__))), 'config.yaml')
with open(config_path, 'r', encoding='utf-8') as file:
_config = yaml.safe_load(file)
pywebio_config(theme=_config['PyWebIO_Theme'],
title=_config['Tab_Title'],
description=_config['Description'],
js_file=[
# 整一个看板娘,二次元浓度++
_config['Live2D_JS'] if _config['Live2D_Enable'] else None,
])
class MainView:
def __init__(self):
self.utils = ViewsUtils()
# 主界面/Main view
def main_view(self):
# 左侧导航栏/Left navbar
with use_scope('main'):
# 设置favicon/Set favicon
favicon_url = _config['Favicon']
session.run_js(f"""
$('head').append('<link rel="icon" type="image/png" href="{favicon_url}">')
""")
# 移除footer/Remove footer
session.run_js("""$('footer').remove()""")
# 设置不允许referrer/Set no referrer
session.run_js("""$('head').append('<meta name=referrer content=no-referrer>');""")
# 设置标题/Set title
title = self.utils.t("TikTok/抖音无水印在线解析下载",
"Douyin/TikTok online parsing and download without watermark")
put_html(f"""
<div align="center">
<a href="/" alt="logo" ><img src="{favicon_url}" width="100"/></a>
<h1 align="center">{title}</h1>
</div>
""")
# 设置导航栏/Navbar
put_row(
[
put_button(self.utils.t("快捷指令", 'iOS Shortcut'),
onclick=lambda: ios_pop_window(), link_style=True, small=True),
put_button(self.utils.t("开放接口", 'Open API'),
onclick=lambda: api_document_pop_window(), link_style=True, small=True),
put_button(self.utils.t("下载器", "Downloader"),
onclick=lambda: downloader_pop_window(), link_style=True, small=True),
put_button(self.utils.t("关于", 'About'),
onclick=lambda: about_pop_window(), link_style=True, small=True),
])
# 设置功能选择/Function selection
options = [
# Index: 0
self.utils.t('🔍批量解析视频', '🔍Batch Parse Video'),
# Index: 1
self.utils.t('🔍解析用户主页视频', '🔍Parse User Homepage Video'),
# Index: 2
self.utils.t('🥚小彩蛋', '🥚Easter Egg'),
]
select_options = select(
self.utils.t('请在这里选择一个你想要的功能吧 ~', 'Please select a function you want here ~'),
required=True,
options=options,
help_text=self.utils.t('📎选上面的选项然后点击提交', '📎Select the options above and click Submit')
)
# 根据输入运行不同的函数
if select_options == options[0]:
parse_video()
elif select_options == options[1]:
put_markdown(self.utils.t('暂未开放,敬请期待~', 'Not yet open, please look forward to it~'))
elif select_options == options[2]:
a() if _config['Easter_Egg'] else put_markdown(self.utils.t('没有小彩蛋哦~', 'No Easter Egg~'))
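Example (illustrative only): the same MainView can also be served without FastAPI through PyWebIO's own development server, the same start_server helper the easter egg module uses. The port below is arbitrary.

from pywebio import start_server

from app.web.app import MainView

if __name__ == "__main__":
    # Serve only the PyWebIO front end; the FastAPI API routes are not available in this mode.
    start_server(lambda: MainView().main_view(), port=8080)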

23
app/web/views/About.py Normal file
View File

@ -0,0 +1,23 @@
from pywebio.output import popup, put_markdown, put_html, put_text, put_link, put_image
from app.web.views.ViewsUtils import ViewsUtils
t = ViewsUtils().t
# 关于弹窗/About pop-up
def about_pop_window():
with popup(t('更多信息', 'More Information')):
put_html('<h3>👀{}</h3>'.format(t('访问记录', 'Visit Record')))
put_image('https://views.whatilearened.today/views/github/evil0ctal/TikTokDownload_PyWebIO.svg',
title='访问记录')
put_html('<hr>')
put_html('<h3>⭐Github</h3>')
put_markdown('[Douyin_TikTok_Download_API](https://github.com/Evil0ctal/Douyin_TikTok_Download_API)')
put_html('<hr>')
put_html('<h3>🎯{}</h3>'.format(t('反馈', 'Feedback')))
put_markdown('{}: [issues](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/issues)'.format(
t('Bug反馈', 'Bug Feedback')))
put_html('<hr>')
put_html('<h3>💖WeChat</h3>')
put_markdown('WeChat: [Evil0ctal](https://mycyberpunk.com/)')
put_html('<hr>')

30
app/web/views/Document.py Normal file
View File

@ -0,0 +1,30 @@
from pywebio.output import popup, put_markdown, put_html, put_text, put_link
from app.web.views.ViewsUtils import ViewsUtils
t = ViewsUtils().t
# API文档弹窗/API documentation pop-up
def api_document_pop_window():
with popup(t("📑API文档", "📑API Document")):
put_markdown(t("> 介绍",
"> Introduction"))
put_markdown(t("你可以利用本项目提供的API接口来获取抖音/TikTok的数据具体接口文档请参考下方链接。",
"You can use the API provided by this project to obtain Douyin/TikTok data. For specific API documentation, please refer to the link below."))
put_markdown(t("如果API不可用,请尝试自己部署本项目,然后在配置文件中修改cookie的值。",
"If the API is not available, please try to deploy this project by yourself, and then modify the value of the cookie in the configuration file."))
put_link('[API Docs]', '/docs', new_window=True)
put_markdown("----")
put_markdown(t("> 更多接口",
"> More APIs"))
put_markdown(t("如果你想要使用更多且更稳定的API服务可以使用付费API服务",
"If you want to use more and more stable API services, you can use paid API services"))
put_link('[TikHub API]', 'https://api.tikhub.io', new_window=True)
put_markdown("----")
put_markdown(t("> 限时免费测试",
"> Free test for a limited time"))
put_markdown(t("这里也有一个测试版的API服务你可以直接免费使用",
"There is also a beta version of the API service, which you can use for free"))
put_markdown(t("测试接口只会保留一段时间,不保证数据的稳定性",
"The test interface will only be retained for a period of time, and the stability of the data is not guaranteed"))
put_link('[TikHub Beta API]', 'https://beta.tikhub.io', new_window=True)

View File

@ -0,0 +1,18 @@
from pywebio.output import popup, put_markdown, put_html, put_text, put_link
from app.web.views.ViewsUtils import ViewsUtils
t = ViewsUtils().t
# 下载器弹窗/Downloader pop-up
def downloader_pop_window():
with popup(t("💾 下载器", "💾 Downloader")):
put_markdown(t("> 桌面端下载器", "> Desktop Downloader"))
put_markdown(t("你可以使用下面的开源项目在桌面端下载视频:",
"You can use the following open source projects to download videos on the desktop:"))
put_markdown("1. [TikTokDownload](https://github.com/Johnserf-Seed/TikTokDownload)")
put_markdown(t("> 备注", "> Note"))
put_markdown(t("1. 请注意下载器的使用规范,不要用于违法用途。",
"1. Please pay attention to the use specifications of the downloader and do not use it for illegal purposes."))
put_markdown(t("2. 下载器相关问题请咨询对应项目的开发者。",
"2. For issues related to the downloader, please consult the developer of the corresponding project."))

View File

@ -0,0 +1,60 @@
import numpy as np
import time
import pyfiglet
from pywebio import start_server
from pywebio.output import put_text, clear, put_html
def a():
H, W = 60, 80
g = np.random.choice([0, 1], size=(H, W))
def u():
n = g.copy()
for i in range(H):
for j in range(W):
t = sum([g[i, (j - 1) % W], g[i, (j + 1) % W], g[(i - 1) % H, j], g[(i + 1) % H, j],
g[(i - 1) % H, (j - 1) % W], g[(i - 1) % H, (j + 1) % W], g[(i + 1) % H, (j - 1) % W],
g[(i + 1) % H, (j + 1) % W]])
n[i, j] = 1 if g[i, j] == 0 and t == 3 else 0 if g[i, j] == 1 and (t < 2 or t > 3) else g[i, j]
return n
def m(s):
put_text(pyfiglet.figlet_format(s, font="slant"))
def c():
m(''.join([chr(int(c, 2)) for c in
['01000101', '01110110', '01101001', '01101100', '01001111', '01100011', '01110100', '01100001',
'01101100', '00001010', '01000111', '01000001', '01001101', '01000101', '00001010', '01001111',
'01000110', '00001010', '01001100', '01001001', '01000110', '01000101', '00001010', '00110010',
'00110000', '00110010', '00110100']]));
time.sleep(3)
for i in range(3, 0, -1): clear(); m(str(i)); time.sleep(1)
clear()
def h(g):
return '<table id="life-grid" style="table-layout: fixed; border-spacing:0;">' + ''.join('<tr>' + ''.join(
f'<td style="width:10px; height:10px; background:{"black" if c else "white"};"></td>' for c in r) + '</tr>'
for r in
g) + '</table>'
c();
put_html(h(g))
def r(g):
return f"<script>" + ''.join(
f'document.getElementById("life-grid").rows[{i}].cells[{j}].style.background = "{"black" if g[i, j] else "white"}";'
for i in range(H) for j in range(W)) + "</script>"
e = time.time() + 120
while time.time() < e:
time.sleep(0.1);
g = u();
put_html(r(g))
if __name__ == '__main__':
# A boring code is ready to run!
# 原神,启动!
start_server(a, port=80)
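For readers decoding the minified easter egg above: u() performs one step of Conway's Game of Life on a torus (the modulo indexing wraps the edges). An equivalent, more readable sketch of the same update rule:

import numpy as np

def life_step(grid: np.ndarray) -> np.ndarray:
    # Count the 8 neighbours of every cell, wrapping at the edges like the % H / % W indexing in u().
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)
    )
    # A dead cell with exactly 3 neighbours is born; a live cell survives with 2 or 3 neighbours.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(grid.dtype)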

235
app/web/views/ParseVideo.py Normal file
View File

@ -0,0 +1,235 @@
import asyncio
import os
import time
import yaml
from pywebio.input import *
from pywebio.output import *
from pywebio_battery import put_video
from app.web.views.ViewsUtils import ViewsUtils
from crawlers.hybrid.hybrid_crawler import HybridCrawler
HybridCrawler = HybridCrawler()
# 读取上级再上级目录的配置文件
config_path = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(__file__)))), 'config.yaml')
with open(config_path, 'r', encoding='utf-8') as file:
config = yaml.safe_load(file)
# 网站域名/Website domain
domain = config['Domain']
# 校验输入值/Validate input value
def valid_check(input_data: str):
# 检索出所有链接并返回列表/Retrieve all links and return a list
url_list = ViewsUtils.find_url(input_data)
# 总共找到的链接数量/Total number of links found
total_urls = len(url_list)
if total_urls == 0:
warn_info = ViewsUtils.t('没有检测到有效的链接,请检查输入的内容是否正确。',
'No valid link detected, please check if the input content is correct.')
return warn_info
else:
# 最大接受提交URL的数量/Maximum number of URLs accepted
max_urls = config['Max_Take_URLs']
if total_urls > int(max_urls):
warn_info = ViewsUtils.t(f'输入的链接太多啦,当前只会处理输入的前{max_urls}个链接!',
f'Too many links input, only the first {max_urls} links will be processed!')
return warn_info
# 错误处理/Error handling
def error_do(reason: str, value: str) -> None:
# 输出一个毫无用处的信息
put_html("<hr>")
put_error(
ViewsUtils.t("发生了一个错误,程序将跳过这个输入值,继续处理下一个输入值。",
"An error occurred, the program will skip this input value and continue to process the next input value."))
put_html(f"<h3>⚠{ViewsUtils.t('详情', 'Details')}</h3>")
put_table([
[
ViewsUtils.t('原因', 'reason'),
ViewsUtils.t('输入值', 'input value')
],
[
reason,
value
]
])
put_markdown(ViewsUtils.t('> 可能的原因:', '> Possible reasons:'))
put_markdown(ViewsUtils.t("- 视频已被删除或者链接不正确。",
"- The video has been deleted or the link is incorrect."))
put_markdown(ViewsUtils.t("- 接口风控,请求过于频繁。",
"- Interface risk control, requests are too frequent."))
put_markdown(ViewsUtils.t("> 寻求帮助:", "> Seek help:"))
put_markdown(ViewsUtils.t(
"- 你可以尝试再次解析,或者尝试自行部署项目,然后替换`./app/crawlers/平台文件夹/config.yaml`中的`cookie`值。",
"- You can try to parse again, or try to deploy the project by yourself, and then replace the `cookie` value in `./app/crawlers/platform folder/config.yaml`."))
put_markdown(
"- GitHub Issue: [Evil0ctal/Douyin_TikTok_Download_API](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/issues)")
put_html("<hr>")
def parse_video():
placeholder = ViewsUtils.t(
"批量解析请直接粘贴多个口令或链接,无需使用符号分开,支持抖音和TikTok链接混合,暂时不支持作者主页链接批量解析。",
"For batch parsing, just paste multiple share codes or links directly (no separators needed). Mixed Douyin and TikTok links are supported; author homepage links cannot be batch parsed yet.")
input_data = textarea(
ViewsUtils.t('请将抖音或TikTok的分享口令或网址粘贴于此',
"Please paste the share code or URL of [Douyin|TikTok] here"),
type=TEXT,
validate=valid_check,
required=True,
placeholder=placeholder,
position=0)
url_lists = ViewsUtils.find_url(input_data)
# 解析开始时间
start = time.time()
# 成功/失败统计
success_count = 0
failed_count = 0
# 链接总数
url_count = len(url_lists)
# 解析成功的url
success_list = []
# 解析失败的url
failed_list = []
# 输出一个提示条
with use_scope('loading_text'):
# 输出一个分行符
put_row([put_html('<br>')])
put_warning(ViewsUtils.t('Server酱正收到你输入的链接啦(◍•ᴗ•◍)\n正在努力处理中,请稍等片刻...',
'ServerChan has received your links! (◍•ᴗ•◍)\nWorking hard on them, please wait a moment...'))
# 结果页标题
put_scope('result_title')
# 遍历链接列表
for url in url_lists:
# 链接编号
url_index = url_lists.index(url) + 1
# 解析
try:
data = asyncio.run(HybridCrawler.hybrid_parsing_single_video(url, minimal=True))
except Exception as e:
error_msg = str(e)
with use_scope(str(url_index)):
error_do(reason=error_msg, value=url)
failed_count += 1
failed_list.append(url)
continue
# 创建一个视频/图集的公有变量
url_type = ViewsUtils.t('视频', 'Video') if data.get('type') == 'video' else ViewsUtils.t('图片', 'Image')
platform = data.get('platform')
table_list = [
[ViewsUtils.t('类型', 'type'), ViewsUtils.t('内容', 'content')],
[ViewsUtils.t('解析类型', 'Type'), url_type],
[ViewsUtils.t('平台', 'Platform'), platform],
[f'{url_type} ID', data.get('aweme_id')],
[ViewsUtils.t(f'{url_type}描述', 'Description'), data.get('desc')],
[ViewsUtils.t('作者昵称', 'Author nickname'), data.get('author').get('nickname')],
[ViewsUtils.t('作者ID', 'Author ID'), data.get('author').get('unique_id')],
[ViewsUtils.t('API链接', 'API URL'),
put_link(
ViewsUtils.t('点击查看', 'Click to view'),
f"{domain}/api/hybrid/video_data?url={url}&minimal=false",
new_window=True)],
[ViewsUtils.t('API链接-精简', 'API URL-Minimal'),
put_link(ViewsUtils.t('点击查看', 'Click to view'),
f"{domain}/api/hybrid/video_data?url={url}&minimal=true",
new_window=True)]
]
# 如果是视频/If it's video
if url_type == ViewsUtils.t('视频', 'Video'):
# 添加视频信息
table_list.insert(4, [ViewsUtils.t('视频链接-水印', 'Video URL-Watermark'),
put_link(ViewsUtils.t('点击查看', 'Click to view'),
data.get('video_data').get('wm_video_url_HQ'), new_window=True)])
table_list.insert(5, [ViewsUtils.t('视频链接-无水印', 'Video URL-No Watermark'),
put_link(ViewsUtils.t('点击查看', 'Click to view'),
data.get('video_data').get('nwm_video_url_HQ'), new_window=True)])
table_list.insert(6, [ViewsUtils.t('视频下载-水印', 'Video Download-Watermark'),
put_link(ViewsUtils.t('点击下载', 'Click to download'),
f"{domain}/download?url={url}&prefix=true&watermark=true",
new_window=True)])
table_list.insert(7, [ViewsUtils.t('视频下载-无水印', 'Video Download-No-Watermark'),
put_link(ViewsUtils.t('点击下载', 'Click to download'),
f"{domain}/download?url={url}&prefix=true&watermark=false",
new_window=True)])
# 添加视频信息
table_list.insert(0, [put_video(data.get('video_data').get('nwm_video_url_HQ'), poster=None, loop=True, width='50%')])
# 如果是图片/If it's image
elif url_type == ViewsUtils.t('图片', 'Image'):
# 添加图片下载链接
table_list.insert(4, [ViewsUtils.t('图片打包下载-水印', 'Download images ZIP-Watermark'),
put_link(ViewsUtils.t('点击下载', 'Click to download'),
f"{domain}/download?url={url}&prefix=true&watermark=true",
new_window=True)])
table_list.insert(5, [ViewsUtils.t('图片打包下载-无水印', 'Download images ZIP-No-Watermark'),
put_link(ViewsUtils.t('点击下载', 'Click to download'),
f"{domain}/download?url={url}&prefix=true&watermark=false",
new_window=True)])
# 添加图片信息
no_watermark_image_list = data.get('image_data').get('no_watermark_image_list')
for image in no_watermark_image_list:
table_list.append(
[ViewsUtils.t('图片预览(如格式可显示): ', 'Image preview (if the format can be displayed):'),
put_image(image, width='50%')])
table_list.append([ViewsUtils.t('图片直链: ', 'Image URL:'),
put_link(ViewsUtils.t('⬆️点击打开图片⬆️', 'Click to open image⬆'), image,
new_window=True)])
# 向网页输出表格/Put table on web page
with use_scope(str(url_index)):
# 显示进度
put_info(
ViewsUtils.t(f'正在解析第{url_index}/{url_count}个链接: ',
f'Parsing link {url_index}/{url_count}: '),
put_link(url, url, new_window=True), closable=True)
put_table(table_list)
put_html('<hr>')
scroll_to(str(url_index))
success_count += 1
success_list.append(url)
# print(f'success_count: {success_count}, success_list: {success_list}')
# 全部解析完成跳出for循环/All parsing completed, break out of for loop
with use_scope('result_title'):
put_row([put_html('<br>')])
put_markdown(ViewsUtils.t('## 📝解析结果:', '## 📝Parsing results:'))
put_row([put_html('<br>')])
with use_scope('result'):
# 清除进度条
clear('loading_text')
# 滚动至result
scroll_to('result')
# for循环结束向网页输出成功提醒
put_success(ViewsUtils.t('解析完成啦 ♪(・ω・)ノ\n请查看以下统计信息如果觉得有用的话请在GitHub上帮我点一个Star吧',
'Parsing completed ♪(・ω・)ノ\nPlease check the following statistics, and if you think it\'s useful, please help me click a Star on GitHub!'))
# 将成功,失败以及总数量显示出来并且显示为代码方便复制
put_markdown(
f'**{ViewsUtils.t("成功", "Success")}:** {success_count} **{ViewsUtils.t("失败", "Failed")}:** {failed_count} **{ViewsUtils.t("总数量", "Total")}:** {success_count + failed_count}')
# 成功列表
if success_count != url_count:
put_markdown(f'**{ViewsUtils.t("成功列表", "Success list")}:**')
put_code('\n'.join(success_list))
# 失败列表
if failed_count > 0:
put_markdown(f'**{ViewsUtils.t("失败列表", "Failed list")}:**')
put_code('\n'.join(failed_list))
# 将url_lists显示为代码方便复制
put_markdown(ViewsUtils.t('**以下是您输入的所有链接:**', '**The following are all the links you entered:**'))
put_code('\n'.join(url_lists))
# 解析结束时间
end = time.time()
# 计算耗时,保留两位小数
time_consuming = round(end - start, 2)
# 显示耗时
put_markdown(f"**{ViewsUtils.t('耗时', 'Time consuming')}:** {time_consuming}s")
# 放置一个按钮,点击后跳转到顶部
put_button(ViewsUtils.t('回到顶部', 'Back to top'), onclick=lambda: scroll_to('1'), color='success',
outline=True)
# 返回主页链接
put_link(ViewsUtils.t('再来一波 (つ´ω`)つ', 'Another wave (つ´ω`)つ'), '/')
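The parsing behind this view can also be called without the web layer; a minimal sketch (illustrative only, the share link is a placeholder) mirroring the asyncio.run(HybridCrawler.hybrid_parsing_single_video(...)) call above:

import asyncio

from crawlers.hybrid.hybrid_crawler import HybridCrawler

async def main():
    crawler = HybridCrawler()
    # minimal=True returns the trimmed structure used to build the result table above.
    data = await crawler.hybrid_parsing_single_video("https://v.douyin.com/xxxxx/", minimal=True)
    print(data.get("platform"), data.get("aweme_id"), data.get("desc"))

asyncio.run(main())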

View File

@ -0,0 +1,48 @@
import os
import yaml
from pywebio.output import popup, put_markdown, put_html, put_text, put_link
from app.web.views.ViewsUtils import ViewsUtils
t = ViewsUtils().t
# 读取上级再上级目录的配置文件
config_path = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(__file__)))), 'config.yaml')
with open(config_path, 'r', encoding='utf-8') as file:
config = yaml.safe_load(file)
config = config['iOS_Shortcut']
# iOS快捷指令弹窗/IOS shortcut pop-up
def ios_pop_window():
with popup(t("iOS快捷指令", "iOS Shortcut")):
version = config["iOS_Shortcut_Version"]
update = config['iOS_Shortcut_Update_Time']
link = config['iOS_Shortcut_Link']
link_en = config['iOS_Shortcut_Link_EN']
note = config['iOS_Shortcut_Update_Note']
note_en = config['iOS_Shortcut_Update_Note_EN']
put_markdown(t('#### 📢 快捷指令介绍:', '#### 📢 Shortcut Introduction:'))
put_markdown(
t('快捷指令运行在iOS平台本快捷指令可以快速调用本项目的公共API将抖音或TikTok的视频或图集下载到你的手机相册中暂时只支持单个链接进行下载。',
'The shortcut runs on the iOS platform, and this shortcut can quickly call the public API of this project to download the video or album of Douyin or TikTok to your phone album. It only supports single link download for now.'))
put_markdown(t('#### 📲 使用方法 ①:', '#### 📲 Operation method ①:'))
put_markdown(t('在抖音或TikTok的APP内浏览你想要无水印保存的视频或图集。',
'The shortcut needs to be used in the Douyin or TikTok app, browse the video or album you want to save without watermark.'))
put_markdown(t('然后点击右下角分享按钮,选择更多,然后下拉找到 "抖音TikTok无水印下载" 这个选项。',
'Then click the share button in the lower right corner, select more, and then scroll down to find the "Douyin TikTok No Watermark Download" option.'))
put_markdown(t('如遇到通知询问是否允许快捷指令访问xxxx (域名或服务器),需要点击允许才可以正常使用。',
'If you are asked whether to allow the shortcut to access xxxx (domain name or server), you need to click Allow to use it normally.'))
put_markdown(t('该快捷指令会在你相册创建一个新的相薄方便你浏览保存的内容。',
'The shortcut will create a new album in your photo album to help you browse the saved content.'))
put_markdown(t('#### 📲 使用方法 ②:', '#### 📲 Operation method ②:'))
put_markdown(t('在抖音或TikTok的视频下方点击分享,然后点击复制链接,然后去快捷指令APP中运行该快捷指令。',
'Tap Share below a Douyin or TikTok video, copy the link, then run this shortcut in the Shortcuts app.'))
put_markdown(t('如果弹窗询问是否允许读取剪切板请同意,随后快捷指令将链接内容保存至相册中。',
'If a pop-up asks whether to allow reading the clipboard, please allow it; the shortcut will then save the linked content to your photo album.'))
put_html('<hr>')
put_text(t(f"最新快捷指令版本: {version}", f"Latest shortcut version: {version}"))
put_text(t(f"快捷指令更新时间: {update}", f"Shortcut update time: {update}"))
put_text(t(f"快捷指令更新内容: {note}", f"Shortcut update content: {note_en}"))
put_link("[点击获取快捷指令 - 中文]", link, new_window=True)
put_html("<br>")
put_link("[Click get Shortcut - English]", link_en, new_window=True)

View File

@ -0,0 +1,24 @@
import re
from pywebio.output import get_scope, clear
from pywebio.session import info as session_info
class ViewsUtils:
# 自动检测语言返回翻译/Auto detect language to return translation
@staticmethod
def t(zh: str, en: str) -> str:
return zh if 'zh' in session_info.user_language else en
# 清除前一个scope/Clear the previous scope
@staticmethod
def clear_previous_scope():
_scope = get_scope(-1)
clear(_scope)
# 解析抖音分享口令中的链接并返回列表/Parse the link in the Douyin share command and return a list
@staticmethod
def find_url(string: str) -> list:
url = re.findall('http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', string)
return url
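A quick usage sketch of find_url (illustrative only; the pasted text is made up). Note that t() reads pywebio.session.info, so it only resolves the language inside an active PyWebIO session.

from app.web.views.ViewsUtils import ViewsUtils

text = "Check this out https://v.douyin.com/xxxxx/ and https://www.tiktok.com/@user/video/1234567890"
print(ViewsUtils.find_url(text))
# ['https://v.douyin.com/xxxxx/', 'https://www.tiktok.com/@user/video/1234567890']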

View File

@ -12,10 +12,6 @@ echo 'installing PIP3...'
apt install python3-pip
echo 'installing NodeJS...'
apt install nodejs
echo 'Creating path: /www/wwwroot'
mkdir -p /www/wwwroot
@ -30,68 +26,13 @@ cd Douyin_TikTok_Download_API/ || exit
pip install -r requirements.txt
echo 'Please edit config.yml, all input must be numbers!'
python3 config.py
echo 'Add Douyin_TikTok_Download_API to system service'
cp /www/wwwroot/Douyin_TikTok_Download_API/daemon/* /etc/systemd/system/
read -r -p "Run API or Web? [api/web/all/quit] " input
case $input in
[aA][pP][iI]|[aA])
read -r -p "Do you want to start the api service when system boot? [y/n] " input
case $input in
[yY])
systemctl enable web_api.service
echo "API service will start when system boot!"
;;
[nN]| *)
echo "You can start the service by running: systemctl start web_api.service"
;;
esac
echo "Starting API..."
systemctl start web_api.service
echo "API is running! You can visit http://your_ip:port"
echo "You can stop the api service by running: systemctl stop web_api.service"
;;
[wW][eE][bB]|[wW])
read -r -p "Do you want to start the app service when system boot? [y/n] " input
case $input in
[yY])
systemctl enable web_app.service
echo "Web service will start when system boot!"
;;
[nN]| *)
echo "You can start the service by running: systemctl start web_app.service"
;;
esac
echo "Starting APP..."
systemctl start web_app.service
echo "APP is running! You can visit http://your_ip:port"
echo "You can stop the app service by running: systemctl stop web_app.service"
;;
[aA][lL][lL])
read -r -p "Do you want to start the app and api service when system boot? [y/n] " input
case $input in
[yY])
systemctl enable web_app.service
systemctl enable web_api.service
;;
[nN]| *)
echo "You can start them on boot by these commands:"
echo "systemctl enable (web_app.service||web_api.service)"
;;
esac
echo "Starting WEB and API Services..."
systemctl start web_app.service
systemctl start web_api.service
echo "API and APP service are running!"
echo "You can stop the api service by running following command: "
echo "systemctl stop (web_app.service||web_api.service)"
;;
*)
echo "Exiting without running anything..."
exit 1
;;
esac
systemctl enable Douyin_TikTok_Download_API.service
echo 'Starting Douyin_TikTok_Download_API service'
systemctl start Douyin_TikTok_Download_API.service

View File

@ -1,14 +1,12 @@
#!/bin/bash
read -r -p "Do you want to update the project? [y/n] " input
read -r -p "Do you want to update Douyin_TikTok_Download_API? [y/n] " input
case $input in
[yY])
cd ..
git pull
echo "Restarting the service - systemctl restart web_app.service"
systemctl restart web_app.service
echo "Restarting the service - systemctl restart web_api.service"
systemctl restart web_api.service
echo "Restarting Douyin_TikTok_Download_API service"
systemctl restart Douyin_TikTok_Download_API.service
echo "Successfully restarted all services!"
;;
[nN]| *)

View File

@ -1,60 +0,0 @@
import yaml
config_path = 'config.yml'
with open(config_path, 'r', encoding='utf-8') as file:
config = yaml.safe_load(file)
def api_config():
api_default_port = config['Web_API']['Port']
api_new_port = input(
f'Default API port: {api_default_port}\nIf you want use different port input new API port here: ')
if api_new_port.isdigit():
if int(api_new_port) == int(api_default_port):
print(f'Use default port for web_api.py: {api_default_port}')
else:
print(f'Use new port for web_api.py: {api_new_port}')
config['Web_API']['Port'] = int(api_new_port)
with open(config_path, "w", encoding="utf-8") as file:
yaml.dump(config, file, allow_unicode=True)
else:
print(f'Use default port for web_api.py: {api_default_port}')
req_limit = config['Web_API']['Rate_Limit']
new_req_limit = input(
f'Default API rate limit: {req_limit}\nIf you want use different rate limit input new rate limit here: ')
if new_req_limit.isdigit():
if int(new_req_limit) == int(req_limit.split('/')[0]):
print(f'Use default rate limit for web_api.py : {req_limit}')
else:
print(f'Use new rate limit: {new_req_limit}/minute')
config['Web_API']['Rate_Limit'] = f'{new_req_limit}/minute'
with open(config_path, "w", encoding="utf-8") as file:
yaml.dump(config, file, allow_unicode=True)
else:
print(f'Use default rate limit for web_api.py: {req_limit}')
def app_config():
app_default_port = config['Web_APP']['Port']
app_new_port = input(
f'Default App port: {app_default_port}\nIf you want use different port input new App port here: ')
if app_new_port.isdigit():
if int(app_new_port) == int(app_default_port):
print(f'Use default port for web_app.py: {app_default_port}')
else:
print(f'Use new port: {app_new_port}')
config['Web_APP']['Port'] = int(app_new_port)
with open(config_path, "w", encoding="utf-8") as file:
yaml.dump(config, file, allow_unicode=True)
else:
print(f'Use default port for web_app.py : {app_default_port}')
if __name__ == '__main__':
api_config()
app_config()

31
config.yaml Normal file
View File

@ -0,0 +1,31 @@
iOS_Shortcut:
iOS_Shortcut_Version: 6.0
iOS_Shortcut_Update_Time: 2024/04/22
iOS_Shortcut_Link: https://www.icloud.com/shortcuts/4465d514869e4ca585074d40328f3e0e
iOS_Shortcut_Link_EN: https://www.icloud.com/shortcuts/58e3a2cbac784a6782f1031c6b1dd9f8
iOS_Shortcut_Update_Note: 重新适配https://api.douyin.wtf(API-V1 3.0.0版本)
iOS_Shortcut_Update_Note_EN: Re-adapt https://api.douyin.wtf (API-V1 3.0.0 version)
Domain: https://douyin.wtf
Tab_Title: Douyin_TikTok_Download_API
Description: Douyin_TikTok_Download_API is a free open-source API service for Douyin/TikTok. It provides a simple, fast, and stable API for developers to develop applications based on Douyin/TikTok.
Favicon: https://raw.githubusercontent.com/Evil0ctal/Douyin_TikTok_Download_API/main/logo/logo192.png
PyWebIO_Theme: minty
Live2D_Enable: true
Live2D_JS: https://fastly.jsdelivr.net/gh/TikHubIO/TikHub_live2d@latest/autoload.js
Easter_Egg: true
Max_Take_URLs: 30
Download_Switch: true
Download_File_Prefix: douyin.wtf_
# 默认下载目录/Default download directory
Download_Path: "./download"
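The web views load this file from the project root (see app/web/app.py and ParseVideo.py above). A small standalone sketch of the same pattern, handy for checking the keys before starting the app (illustrative only):

import yaml

with open("config.yaml", "r", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

# A few of the keys consumed elsewhere in the project.
for key in ("Domain", "Max_Take_URLs", "Download_Switch", "Download_Path"):
    assert key in cfg, f"missing config key: {key}"
print(cfg["Domain"], cfg["Max_Take_URLs"])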

View File

@ -1,109 +0,0 @@
# -*- encoding: utf-8 -*-
# @Author: https://github.com/Evil0ctal/
# @Time: 2021/11/06
# @Update: 2024/03/25
# @Function:
# 项目的配置文件/Config file of the project
# 去看看我们的另外一个项目吧一个YAML文件的多态键值路径映射解析器https://github.com/PKVPM/PKVPM
# Check out our other project, a polymorphic key-value path mapping parser for YAML files:
Scraper: # scraper.py
# 是否使用代理(如果部署在IP受限国家需要开启默认为False关闭请自行收集代理下面代理仅作为示例不保证可用性)
# Whether to use proxy (if deployed in a country with IP restrictions, it needs to be turned on by default, False is closed. Please collect proxies yourself. The following proxies are only for reference and do not guarantee availability)
Proxy_switch: false
# 是否根据不同协议(http/https)使用不同代理设置为True时修改Http_proxy/Https_proxy这两个变量的值
# Whether to use different proxies for different protocols (http/https). When set to True, modify the values of the two variables Http_proxy/Https_proxy
Use_different_protocols: false
# http/https协议都使用以下代理(Use_different_protocols为False时生效)
# Both http/https protocols use the following proxy (effective when Use_different_protocols is False)
All: "45.167.124.5:9992"
# http协议使用以下代理(Use_different_protocols为True时生效)
# The http protocol uses the following proxy (effective when Use_different_protocols is True)
Http_proxy: "http://45.167.124.5:9992"
# https协议使用以下代理(Use_different_protocols为True时生效)
# The https protocol uses the following proxy (effective when Use_different_protocols is True)
Https_proxy: "https://45.167.124.5:9992"
# 抖音cookies配置项
# odin_tt=xxx;sessionid_ss=xxx;ttwid=xxx;passport_csrf_token=xxx;msToken=xxx;
DouYinCookies: ttwid=1%7C3UtF-qneFEBjIHPIIyqSkXqZX2ck8oxwacwbxFPtXeo%7C1711063795%7C2754df694da7a7bbdf125168d5945218a9b8ff284b558cfa19117dc4dce3fdc3; IsDouyinActive=true; home_can_add_dy_2_desktop=%220%22; dy_swidth=1835; dy_sheight=1147; stream_recommend_feed_params=%22%7B%5C%22cookie_enabled%5C%22%3Atrue%2C%5C%22screen_width%5C%22%3A1835%2C%5C%22screen_height%5C%22%3A1147%2C%5C%22browser_online%5C%22%3Atrue%2C%5C%22cpu_core_num%5C%22%3A16%2C%5C%22device_memory%5C%22%3A0%2C%5C%22downlink%5C%22%3A%5C%22%5C%22%2C%5C%22effective_type%5C%22%3A%5C%22%5C%22%2C%5C%22round_trip_time%5C%22%3A0%7D%22; strategyABtestKey=%221712719612.217%22; msToken=oxHVRRQWlfcMd0qxo8wiAj1tJIHFabhq1XOh9DVvQ58oxBwbCGnBwQHxp-haC4mw5OQJi91v2pox0jKbbE8vgm6iZvymn0ztfGaThiXDglsowW_CG6Uss2phY377eSE=; passport_csrf_token=25b422fb8a1a9bc347f618c90b956abb; passport_csrf_token_default=25b422fb8a1a9bc347f618c90b956abb; bd_ticket_guard_client_web_domain=2; GlobalGuideTimes=%221712569923%7C0%22; odin_tt=d485a667b6241f02ca1d683ea28a86c2ca1b0fdb39b31dd30bf81ff75f33c5a578f022fb719c6429ba8464ab5cbc7537dcddaaf3defda4b26c9e3b37a54272f0ebb24a5021285f8b886ba9825d97af05; n_mh=13KNPUKNEzoW3A4J-OLRxfal2zj1GbF-vJUFPs3WSIY; _bd_ticket_crypt_doamin=2; _bd_ticket_crypt_cookie=66d53989a3f243291aa66d25c9c09cd6; LOGIN_STATUS=1; __security_server_data_status=1; store-region=us; store-region-src=uid; d_ticket=47f85db73b2c2b14359c189c1813a59f8c466; my_rd=2; stream_player_status_params=%22%7B%5C%22is_auto_play%5C%22%3A0%2C%5C%22is_full_screen%5C%22%3A0%2C%5C%22is_full_webscreen%5C%22%3A0%2C%5C%22is_mute%5C%22%3A1%2C%5C%22is_speed%5C%22%3A1%2C%5C%22is_visible%5C%22%3A0%7D%22; __live_version__=%221.1.1.9068%22; live_use_vvc=%22false%22; volume_info=%7B%22isUserMute%22%3Afalse%2C%22isMute%22%3Atrue%2C%22volume%22%3A0.5%7D; FORCE_LOGIN=%7B%22videoConsumedRemainSeconds%22%3A180%2C%22isForcePopClose%22%3A1%7D; xgplayer_user_id=145124503600; s_v_web_id=verify_luop8tbv_378419a4_20c9_f688_db00_a0edb906bd3d; download_guide=%223%2F20240407%2F1%22; SEARCH_RESULT_LIST_TYPE=%22single%22; bd_ticket_guard_client_data=eyJiZC10aWNrZXQtZ3VhcmQtdmVyc2lvbiI6MiwiYmQtdGlja2V0LWd1YXJkLWl0ZXJhdGlvbi12ZXJzaW9uIjoxLCJiZC10aWNrZXQtZ3VhcmQtcmVlLXB1YmxpYy1rZXkiOiJCTFFUdWdBbEg4Q1NxRENRdE9QdnN6K1pSOVBjdnBCOWg5dlp1VDhSRU1qSFFVNEVia2dOYnRHR0pBZFZ3c1hiak5EV01WTjBXd05CWEtSbTBWNDI4eHc9IiwiYmQtdGlja2V0LWd1YXJkLXdlYi12ZXJzaW9uIjoxfQ%3D%3D; tt_scid=yOPz66EFkYWVxEZdzlp2oXz8-93ZXFI3QBoGe-QiJy5FRMrzYTEobhpvfx4a.X9N3954; msToken=6HzG-uXu_cIXQcSmsAlfdTG0FB7nv8rH_rlt7_J_wnxrmkKqUNpkaW_PdESzG56g1WWbwhpACRAA03qYKXm2ghww8R2zNchjHKEQ3P6WIJvfaSkpzgLYEQMcrjKLkIA=; __ac_nonce=0661606fc0043e518c105; __ac_signature=_02B4Z6wo00f01ijouYwAAIDDa9gg7RRXqrYo2b0AAOw6LEFj2PgR7B-bsW-.G1T6BExjYP5wGiFF27ouN.9EpEhOXNiYVANCKknwXm-8Sh1xYqGlQipz2XfVtBsxRgPyLtOatlJjFvKY0n7vda
Web_API: # web_api.py
# API链接 如http://127.0.0.1:2333 或 http://api.douyin.wtf (末尾不要留斜杠)
# API link, such as: http://127.0.0.1:2333 or http://api.douyin.wtf (no slash at the end)
Domain: "http://api.douyin.wtf"
# 限制API的请求次数/Limited API requests
Rate_Limit: "10/minute"
# API默认运行端口/Default port of API
Port: 8000
# 默认下载目录/Default download directory
Download_Path: "./download"
# 是否开启下载[tag = Download]功能(默认开启,关闭后无法下载)/Whether to enable the download [tag = Download] function (default open, closed after download)
Download_Switch: true
# 是否自动清理下载目录/Whether to automatically clean up the download directory
Download_Path_Clean_Switch: true
# 下载文件夹自动删除时间(单位:秒)/Download folder automatic deletion time (unit: seconds)
Download_Path_Clean_Timer: 3600
# 默认下载文件名前缀/Default download file name prefix
File_Name_Prefix: "api.douyin.wtf_"
# 是否记录API调用日志/Whether to record API call logs
Allow_Logs: true
# 快捷指令版本/Shortcut version
iOS_Shortcut_Version: "6.0"
# 快捷指令Link(Chinese_Language)
iOS_Shortcut_Link: "https://www.icloud.com/shortcuts/4465d514869e4ca585074d40328f3e0e"
# Shortcut Link(English_Language)
iOS_Shortcut_Link_EN: "https://www.icloud.com/shortcuts/58e3a2cbac784a6782f1031c6b1dd9f8"
# 快捷指令更新时间/Shortcut update time
iOS_Shortcut_Update_Time: "2022/11/06"
# 快捷指令更新记录/Shortcut update log
iOS_Shortcut_Update_Note: "重新适配https://api.douyin.wtf(API-V1 3.0.0版本)"
# iOS shortcut update note
iOS_Shortcut_Update_Note_EN: "Re-adapt https://api.douyin.wtf (API-V1 3.0.0 version)"
Web_APP: # web_app.py
# 网页默认运行端口/Web default running port
Port: 80
# PyWebIO是否使用CDN来获取前端的静态资源(防止CDN被墙导致无法正常显示)
# Whether PyWebIO uses CDN to obtain static resources of the front end (to prevent CDN from being blocked and displayed normally)
PyWebIO_CDN: true
# 最大接受提交URL的数量/Maximum number of URLs accepted for submission
Max_Take_URLs: 200
# 是否记录错误日志/Whether to record error logs
Allow_Logs: true
# 网页标题
Web_Title: "TikTok/抖音无水印在线解析下载"
# Web Title English
Web_Title_English: "Douyin/TikTok online parsing and download without watermark"
# 网页描述
Web_Description: "在线批量解析TikTok/抖音视频和图片,支持无水印下载,官方数据接口,稳定,开源,免费,无广告。"
# Web Description English
Web_Description_English: "Online batch parsing of TikTok/Douyin videos and pictures, support for no watermark download, official data interface, stable, open source, free, no ads."
# 网页关键词/Keywords of the web page
Keywords: "抖音,tiktok,水印,无水印,no-watermark,抖音去水印,tiktok no watermark,在线,online,api,快捷指令,shortcut,下载,解析,parsing,tiktok api,抖音api,抖音去水印在线,tiktok去水印在线,downloader,下载器,free api,免费api"

349
crawlers/base_crawler.py Normal file
View File

@ -0,0 +1,349 @@
# ==============================================================================
# Copyright (C) 2021 Evil0ctal
#
# This file is part of the Douyin_TikTok_Download_API project.
#
# This project is licensed under the Apache License 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
#         __
#        />  フ
#       |  _  _ l
#       ` ミ_x
#      /      | Feed me Stars ⭐
#     /  ヽ   ノ
#     │  | | |
#  / ̄|   | | |
#  | ( ̄ヽ__ヽ_)__)
#  \二つ
# ==============================================================================
#
# Contributor Link:
# - https://github.com/Evil0ctal
# - https://github.com/Johnserf-Seed
#
# ==============================================================================
import httpx
import json
import asyncio
import re
from httpx import Response
from crawlers.utils.logger import logger
from crawlers.utils.api_exceptions import (
APIError,
APIConnectionError,
APIResponseError,
APITimeoutError,
APIUnavailableError,
APIUnauthorizedError,
APINotFoundError,
APIRateLimitError,
APIRetryExhaustedError,
)
class BaseCrawler:
"""
基础爬虫客户端 (Base crawler client)
"""
def __init__(
self,
proxies: dict = None,
max_retries: int = 3,
max_connections: int = 50,
timeout: int = 10,
max_tasks: int = 50,
crawler_headers: dict = {},
):
if isinstance(proxies, dict):
self.proxies = proxies
# [f"{k}://{v}" for k, v in proxies.items()]
else:
self.proxies = None
# 爬虫请求头 / Crawler request header
self.crawler_headers = crawler_headers or {}
# 异步的任务数 / Number of asynchronous tasks
self._max_tasks = max_tasks
self.semaphore = asyncio.Semaphore(max_tasks)
# 限制最大连接数 / Limit the maximum number of connections
self._max_connections = max_connections
self.limits = httpx.Limits(max_connections=max_connections)
# 业务逻辑重试次数 / Business logic retry count
self._max_retries = max_retries
# 底层连接重试次数 / Underlying connection retry count
self.atransport = httpx.AsyncHTTPTransport(retries=max_retries)
# 超时等待时间 / Timeout waiting time
self._timeout = timeout
self.timeout = httpx.Timeout(timeout)
# 异步客户端 / Asynchronous client
self.aclient = httpx.AsyncClient(
headers=self.crawler_headers,
proxies=self.proxies,
timeout=self.timeout,
limits=self.limits,
transport=self.atransport,
)
async def fetch_response(self, endpoint: str) -> Response:
"""获取数据 (Get data)
Args:
endpoint (str): 接口地址 (Endpoint URL)
Returns:
Response: 原始响应对象 (Raw response object)
"""
return await self.get_fetch_data(endpoint)
async def fetch_get_json(self, endpoint: str) -> dict:
"""获取 JSON 数据 (Get JSON data)
Args:
endpoint (str): 接口地址 (Endpoint URL)
Returns:
dict: 解析后的JSON数据 (Parsed JSON data)
"""
response = await self.get_fetch_data(endpoint)
return self.parse_json(response)
async def fetch_post_json(self, endpoint: str, params: dict = {}, data=None) -> dict:
"""获取 JSON 数据 (Post JSON data)
Args:
endpoint (str): 接口地址 (Endpoint URL)
Returns:
dict: 解析后的JSON数据 (Parsed JSON data)
"""
response = await self.post_fetch_data(endpoint, params, data)
return self.parse_json(response)
def parse_json(self, response: Response) -> dict:
"""解析JSON响应对象 (Parse JSON response object)
Args:
response (Response): 原始响应对象 (Raw response object)
Returns:
dict: 解析后的JSON数据 (Parsed JSON data)
"""
if (
response is not None
and isinstance(response, Response)
and response.status_code == 200
):
try:
return response.json()
except json.JSONDecodeError as e:
# 尝试使用正则表达式匹配response.text中的json数据 / Fall back to extracting a JSON object from response.text
match = re.search(r"\{.*\}", response.text)
if match:
try:
return json.loads(match.group())
except json.JSONDecodeError:
pass
logger.error("解析 {0} 接口 JSON 失败: {1}".format(response.url, e))
raise APIResponseError("解析JSON数据失败")
else:
if isinstance(response, Response):
logger.error(
"获取数据失败。状态码: {0}".format(response.status_code)
)
else:
logger.error("无效响应类型。响应类型: {0}".format(type(response)))
raise APIResponseError("获取数据失败")
async def get_fetch_data(self, url: str):
"""
获取GET端点数据 (Get GET endpoint data)
Args:
url (str): 端点URL (Endpoint URL)
Returns:
response: 响应内容 (Response content)
"""
for attempt in range(self._max_retries):
try:
response = await self.aclient.get(url, follow_redirects=True)
if not response.text.strip() or not response.content:
error_message = "{0} 次响应内容为空, 状态码: {1}, URL:{2}".format(attempt + 1,
response.status_code,
response.url)
logger.warning(error_message)
if attempt == self._max_retries - 1:
raise APIRetryExhaustedError(
"获取端点数据失败, 次数达到上限"
)
await asyncio.sleep(self._timeout)
continue
# logger.info("响应状态码: {0}".format(response.status_code))
response.raise_for_status()
return response
except httpx.RequestError:
raise APIConnectionError("连接端点失败,检查网络环境或代理:{0} 代理:{1} 类名:{2}"
.format(url, self.proxies, self.__class__.__name__)
)
except httpx.HTTPStatusError as http_error:
self.handle_http_status_error(http_error, url, attempt + 1)
except APIError as e:
e.display_error()
async def post_fetch_data(self, url: str, params: dict = {}, data=None):
"""
获取POST端点数据 (Get POST endpoint data)
Args:
url (str): 端点URL (Endpoint URL)
params (dict): POST请求参数 (POST request parameters)
Returns:
response: 响应内容 (Response content)
"""
for attempt in range(self._max_retries):
try:
response = await self.aclient.post(
url,
json=None if not params else dict(params),
data=None if not data else data,
follow_redirects=True
)
if not response.text.strip() or not response.content:
error_message = "{0} 次响应内容为空, 状态码: {1}, URL:{2}".format(attempt + 1,
response.status_code,
response.url)
logger.warning(error_message)
if attempt == self._max_retries - 1:
raise APIRetryExhaustedError(
"获取端点数据失败, 次数达到上限"
)
await asyncio.sleep(self._timeout)
continue
# logger.info("响应状态码: {0}".format(response.status_code))
response.raise_for_status()
return response
except httpx.RequestError:
raise APIConnectionError(
"连接端点失败,检查网络环境或代理:{0} 代理:{1} 类名:{2}".format(url, self.proxies,
self.__class__.__name__)
)
except httpx.HTTPStatusError as http_error:
self.handle_http_status_error(http_error, url, attempt + 1)
except APIError as e:
e.display_error()
async def head_fetch_data(self, url: str):
"""
获取HEAD端点数据 (Get HEAD endpoint data)
Args:
url (str): 端点URL (Endpoint URL)
Returns:
response: 响应内容 (Response content)
"""
try:
response = await self.aclient.head(url)
# logger.info("响应状态码: {0}".format(response.status_code))
response.raise_for_status()
return response
except httpx.RequestError:
raise APIConnectionError("连接端点失败,检查网络环境或代理:{0} 代理:{1} 类名:{2}".format(
url, self.proxies, self.__class__.__name__
)
)
except httpx.HTTPStatusError as http_error:
self.handle_http_status_error(http_error, url, 1)
except APIError as e:
e.display_error()
def handle_http_status_error(self, http_error, url: str, attempt):
"""
处理HTTP状态错误 (Handle HTTP status error)
Args:
http_error: HTTP状态错误 (HTTP status error)
url: 端点URL (Endpoint URL)
attempt: 尝试次数 (Number of attempts)
Raises:
APIConnectionError: 连接端点失败 (Failed to connect to endpoint)
APIResponseError: 响应错误 (Response error)
APIUnavailableError: 服务不可用 (Service unavailable)
APINotFoundError: 端点不存在 (Endpoint does not exist)
APITimeoutError: 连接超时 (Connection timeout)
APIUnauthorizedError: 未授权 (Unauthorized)
APIRateLimitError: 请求频率过高 (Request frequency is too high)
APIRetryExhaustedError: 重试次数达到上限 (The number of retries has reached the upper limit)
"""
response = getattr(http_error, "response", None)
status_code = getattr(response, "status_code", None)
if response is None or status_code is None:
logger.error("HTTP状态错误: {0}, URL: {1}, 尝试次数: {2}".format(
http_error, url, attempt
)
)
raise APIResponseError(f"处理HTTP错误时遇到意外情况: {http_error}")
if status_code == 302:
pass
elif status_code == 404:
raise APINotFoundError(f"HTTP Status Code {status_code}")
elif status_code == 503:
raise APIUnavailableError(f"HTTP Status Code {status_code}")
elif status_code == 408:
raise APITimeoutError(f"HTTP Status Code {status_code}")
elif status_code == 401:
raise APIUnauthorizedError(f"HTTP Status Code {status_code}")
elif status_code == 429:
raise APIRateLimitError(f"HTTP Status Code {status_code}")
else:
logger.error("HTTP状态错误: {0}, URL: {1}, 尝试次数: {2}".format(
status_code, url, attempt
)
)
raise APIResponseError(f"HTTP状态错误: {status_code}")
async def close(self):
await self.aclient.aclose()
async def __aenter__(self):
return self
async def __aexit__(self, exc_type, exc_val, exc_tb):
await self.aclient.aclose()
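A minimal usage sketch of the client above (illustrative only; the endpoint URL is a placeholder). Retries, proxies, headers and connection limits all come from the constructor arguments, and the async context manager closes the underlying httpx client.

import asyncio

from crawlers.base_crawler import BaseCrawler

async def main():
    async with BaseCrawler(
        proxies=None,  # the expected dict format follows the installed httpx version
        max_retries=3,
        timeout=10,
        crawler_headers={"User-Agent": "Mozilla/5.0"},
    ) as crawler:
        data = await crawler.fetch_get_json("https://httpbin.org/json")
        print(data)

asyncio.run(main())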

View File

@ -0,0 +1,23 @@
TokenManager:
douyin:
headers:
Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36
Referer: https://www.douyin.com/
Cookie: __ac_nonce=066230555003b69f3d0f8; __ac_signature=_02B4Z6wo00f01w7tO1AAAIDCBMvg-bBoGdMOzT.AAKWia3; ttwid=1%7CFqN4u6-bqw9dKIb4BPG9sJ3uK6liJe-2wLmYxLyKSjo%7C1713571158%7C6ff5e572886adbccc6888596e8c93ead09164ab0553e38b54fcf2266e9f3975f; douyin.com; device_web_cpu_core=32; device_web_memory_size=8; architecture=amd64; IsDouyinActive=true; dy_swidth=1463; dy_sheight=915; stream_recommend_feed_params=%22%7B%5C%22cookie_enabled%5C%22%3Atrue%2C%5C%22screen_width%5C%22%3A1463%2C%5C%22screen_height%5C%22%3A915%2C%5C%22browser_online%5C%22%3Atrue%2C%5C%22cpu_core_num%5C%22%3A32%2C%5C%22device_memory%5C%22%3A8%2C%5C%22downlink%5C%22%3A10%2C%5C%22effective_type%5C%22%3A%5C%224g%5C%22%2C%5C%22round_trip_time%5C%22%3A50%7D%22; strategyABtestKey=%221713571161.726%22; stream_player_status_params=%22%7B%5C%22is_auto_play%5C%22%3A0%2C%5C%22is_full_screen%5C%22%3A0%2C%5C%22is_full_webscreen%5C%22%3A0%2C%5C%22is_mute%5C%22%3A1%2C%5C%22is_speed%5C%22%3A1%2C%5C%22is_visible%5C%22%3A1%7D%22; volume_info=%7B%22isUserMute%22%3Afalse%2C%22isMute%22%3Atrue%2C%22volume%22%3A0.5%7D; csrf_session_id=6f34e666e71445c9d39d8d06a347a13f; FORCE_LOGIN=%7B%22videoConsumedRemainSeconds%22%3A180%7D; passport_csrf_token=5173fb1efb9b16ee61c56dcee57331e4; passport_csrf_token_default=5173fb1efb9b16ee61c56dcee57331e4; bd_ticket_guard_client_data=eyJiZC10aWNrZXQtZ3VhcmQtdmVyc2lvbiI6MiwiYmQtdGlja2V0LWd1YXJkLWl0ZXJhdGlvbi12ZXJzaW9uIjoxLCJiZC10aWNrZXQtZ3VhcmQtcmVlLXB1YmxpYy1rZXkiOiJCTjNWTnlqalFKamc4SmNCS1dFcjdyRmd1bDNXSGhwaFJhbU5jMldhYnRGUis3VGxhbVlyaTJDMlpSbDRjek1QRXlaSTFlQlowQUhrMkQyOGNoS08xdTg9IiwiYmQtdGlja2V0LWd1YXJkLXdlYi12ZXJzaW9uIjoxfQ%3D%3D; bd_ticket_guard_client_web_domain=2; home_can_add_dy_2_desktop=%221%22; odin_tt=f8fd6095f16034d5e61156f2c0ba9fe3a16e158a40aa0586d8cf4a134558489d2ea591bb5beca05b1b561004c75273c5ccc9fe868f102dd137aab43336d9a8f9df0ac6298c783d03e446b7f2ca4ce154; msToken=lKfc9qATIFKJ9_UT6GpjnH3aXhZh6x5pUp4pknx9OmVJazLzv_VRZUnCDpU9nLzDeZPZpd4E3imehA9UicUakm6NVArFul8oi4-3YE4IB66t3Zw1hpkcZDQFbs1egY8=
proxies:
http:
https:
msToken:
url: https://mssdk.bytedance.com/web/report
magic: 538969122
version: 1
dataType: 8
strData: fWOdJTQR3/jwmZqBBsPO6tdNEc1jX7YTwPg0Z8CT+j3HScLFbj2Zm1XQ7/lqgSutntVKLJWaY3Hc/+vc0h+So9N1t6EqiImu5jKyUa+S4NPy6cNP0x9CUQQgb4+RRihCgsn4QyV8jivEFOsj3N5zFQbzXRyOV+9aG5B5EAnwpn8C70llsWq0zJz1VjN6y2KZiBZRyonAHE8feSGpwMDeUTllvq6BG3AQZz7RrORLWNCLEoGzM6bMovYVPRAJipuUML4Hq/568bNb5vqAo0eOFpvTZjQFgbB7f/CtAYYmnOYlvfrHKBKvb0TX6AjYrw2qmNNEer2ADJosmT5kZeBsogDui8rNiI/OOdX9PVotmcSmHOLRfw1cYXTgwHXr6cJeJveuipgwtUj2FNT4YCdZfUGGyRDz5bR5bdBuYiSRteSX12EktobsKPksdhUPGGv99SI1QRVmR0ETdWqnKWOj/7ujFZsNnfCLxNfqxQYEZEp9/U01CHhWLVrdzlrJ1v+KJH9EA4P1Wo5/2fuBFVdIz2upFqEQ11DJu8LSyD43qpTok+hFG3Moqrr81uPYiyPHnUvTFgwA/TIE11mTc/pNvYIb8IdbE4UAlsR90eYvPkI+rK9KpYN/l0s9ti9sqTth12VAw8tzCQvhKtxevJRQntU3STeZ3coz9Dg8qkvaSNFWuBDuyefZBGVSgILFdMy33//l/eTXhQpFrVc9OyxDNsG6cvdFwu7trkAENHU5eQEWkFSXBx9Ml54+fa3LvJBoacfPViyvzkJworlHcYYTG392L4q6wuMSSpYUconb+0c5mwqnnLP6MvRdm/bBTaY2Q6RfJcCxyLW0xsJMO6fgLUEjAg/dcqGxl6gDjUVRWbCcG1NAwPCfmYARTuXQYbFc8LO+r6WQTWikO9Q7Cgda78pwH07F8bgJ8zFBbWmyrghilNXENNQkyIzBqOQ1V3w0WXF9+Z3vG3aBKCjIENqAQM9qnC14WMrQkfCHosGbQyEH0n/5R2AaVTE/ye2oPQBWG1m0Gfcgs/96f6yYrsxbDcSnMvsA+okyd6GfWsdZYTIK1E97PYHlncFeOjxySjPpfy6wJc4UlArJEBZYmgveo1SZAhmXl3pJY3yJa9CmYImWkhbpwsVkSmG3g11JitJXTGLIfqKXSAhh+7jg4HTKe+5KNir8xmbBI/DF8O/+diFAlD+BQd3cV0G4mEtCiPEhOvVLKV1pE+fv7nKJh0t38wNVdbs3qHtiQNN7JhY4uWZAosMuBXSjpEtoNUndI+o0cjR8XJ8tSFnrAY8XihiRzLMfeisiZxWCvVwIP3kum9MSHXma75cdCQGFBfFRj0jPn1JildrTh2vRgwG+KeDZ33BJ2VGw9PgRkztZ2l/W5d32jc7H91FftFFhwXil6sA23mr6nNp6CcrO7rOblcm5SzXJ5MA601+WVicC/g3p6A0lAnhjsm37qP+xGT+cbCFOfjexDYEhnqz0QZm94CCSnilQ9B/HBLhWOddp9GK0SABIk5i3xAH701Xb4HCcgAulvfO5EK0RL2eN4fb+CccgZQeO1Zzo4qsMHc13UG0saMgBEH8SqYlHz2S0CVHuDY5j1MSV0nsShjM01vIynw6K0T8kmEyNjt1eRGlleJ5lvE8vonJv7rAeaVRZ06rlYaxrMT6cK3RSHd2liE50Z3ik3xezwWoaY6zBXvCzljyEmqjNFgAPU3gI+N1vi0MsFmwAwFzYqqWdk3jwRoWLp//FnawQX0g5T64CnfAe/o2e/8o5/bvz83OsAAwZoR48GZzPu7KCIN9q4GBjyrePNx5Csq2srblifmzSKwF5MP/RLYsk6mEE15jpCMKOVlHcu0zhJybNP3AKMVllF6pvn+HWvUnLXNkt0A6zsfvjAva/tbLQiiiYi6vtheasIyDz3HpODlI+BCkV6V8lkTt7m8QJ1IcgTfqjQBummyjYTSwsQji3DdNCnlKYd13ZQa545utqu837FFAzOZQhbnC3bKqeJqO2sE3m7WBUMbRWLflPRqp/PsklN+9jBPADKxKPl8g6/NZVq8fB1w68D5EJlGExdDhglo4B0aihHhb1u3+zJ2DqkxkPCGBAZ2AcuFIDzD53yS4NssoWb4HJ7YyzPaJro+tgG9TshWRBtUw8Or3m0OtQtX+rboYn3+GxvD1O8vWInrg5qxnepelRcQzmnor4rHF6ZNhAJZAf18Rjncra00HPJBugY5rD+EwnN9+mGQo43b01qBBRYEnxy9JJYuvXxNXxe47/MEPOw6qsxN+dmyIWZSuzkw8K+iBM/anE11yfU4qTFt0veCaVprK6tXaFK0ZhGXDOYJd70sjIP4UrPhatp8hqIXSJ2cwi70B+TvlDk/o19CA3bH6YxrAAVeag1P9hmNlfJ7NxK3Jp7+Ny1Vd7JHWVF+R6rSJiXXPfsXi3ZEy0klJAjI51NrDAnzNtgIQf0V8OWeEVv7F8Rsm3/GKnjdNOcDKymi9agZUgtctENWbCXGFnI40NHuVHtBRZeYAYtwfV7v6U0bP9s7uZGpkp+OETHMv3AyV0MVbZwQvarnjmct4Z3Vma+DvT+Z4VlMVnkC2x2FLt26K3SIMz+KV2XLv5ocEdPFSn1vMR7zruCWC8XqAG288biHo/soldmb/nlw8o8qlfZj4h296K3hfdFubGIUtqgsrZCrLCkkRC08Cv1ozEX/y6t2YrQepwiNmwDVk5IufStVvJMj+y2r9TcYLv7UKWXx3P6aySvM2ZHPaZhv+6Z/A/jIMBSvOizn4qG11iK7Oo6JYhxCSMJZsetjsnL4ecSIAufEmoFlAScWBh6nFArRpVLvkAZ3tej7H2lWFRXIU7x7mdBfGqU82PpM6znKMMZCpEsvHqpkSPSL+Kwz2z1f5wW7BKcKK4kNZ8iveg9VzY1NNjs91qU8DJpUnGyM04C7KNMpeilEmoOxvyelMQdi85ndOVmigVKmy5JYlODNX744sHpeqmMEK/ux3xY5O406lm7dZlyGPSMrFWbm4rzqvSEIskP43+9xVP8L84GeHE4RpOHg3qh/shx+/WnT1UhKuKpByHCpLoEo144udpzZswCYSMp58uPrlwdVF31//AacTRk8dUP3tBlnSQPa1eTpXWFCn7vIiqOTXaRL//YQK+e7ssrgSUnwhuGKJ8aqNDgdsL+haVZnV9g5Qrju643adyNixvYFEp0uxzOzVkekOMh2FYnFVIL2mJYGpZEXlAIC0zQbb54rSP89j0G7soJ2HcOkD0NmMEWj/7hUdTuMin1lRNde/qmHjwhbhqL8Z9MEO/YG3iLMgFTgSNQQhyE8AZAAKnehmzjORJfbK+qxyiJ07J843EDduzOoYt9p/YLqyTFmAgpdfK0uYrtAJ47cbl5WWhVXp5/XUxwWdL7TvQB0Xh6ir1/XBRcsVSDrR7cPE221ThmW1EPzD+SPf2L2gS0WromZqj1PhLgk92YnnR9s7/nLBXZHPKy+fDbJT16QqabFKqAl9G0blyf+R5UGX2kN+iQp4VGXEoH5lXxNNTlgRskzrW7KliQX
cac20oimAHUE8Phf+rXXglpmSv4XN3eiwfXwvOaAMVjMRmRxsKitl5iZnwpcdbsC4jt16g2r/ihlKzLIYju+XZej4dNMlkftEidyNg24IVimJthXY1H15RZ8Hm7mAM/JZrsxiAVI0A49pWEiUk3cyZcBzq/vVEjHUy4r6IZnKkRvLjqsvqWE95nAGMor+F0GLHWfBCVkuI51EIOknwSB1eTvLgwgRepV4pdy9cdp6iR8TZndPVCikflXYVMlMEJ2bJ2c0Swiq57ORJW6vQwnkxtPudpFRc7tNNDzz4LKEznJxAwGi6pBR7/co2IUgRw1ijLFTHWHQJOjgc7KaduHI0C6a+BJb4Y8IWuIk2u2qCMF1HNKFAUn/J1gTcqtIJcvK5uykpfJFCYc899TmUc8LMKI9nu57m0S44Y2hPPYeW4XSakScsg8bJHMkcXk3Tbs9b4eqiD+kHUhTS2BGfsHadR3d5j8lNhBPzA5e+mE==
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36 Edg/117.0.2045.47
ttwid:
url: https://ttwid.bytedance.com/ttwid/union/register/
data: '{"region":"cn","aid":1768,"needFid":false,"service":"www.ixigua.com","migrate_info":{"ticket":"","source":"node"},"cbUrlProtocol":"https","union":true}'
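A hedged sketch of wiring the headers from this YAML into the shared BaseCrawler (the config path below is an assumption; the TokenManager in crawlers/douyin/web/utils.py presumably reads the msToken/ttwid sections of this file itself):

import yaml

from crawlers.base_crawler import BaseCrawler

# Assumed location of this YAML inside the repo; adjust if it lives elsewhere.
with open("crawlers/douyin/web/config.yaml", "r", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

headers = cfg["TokenManager"]["douyin"]["headers"]

async def fetch_json(url: str) -> dict:
    async with BaseCrawler(crawler_headers=headers) as crawler:
        return await crawler.fetch_get_json(url)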

View File

@ -0,0 +1,152 @@
class DouyinAPIEndpoints:
"""
API Endpoints for Douyin
"""
# 抖音域名 (Douyin Domain)
DOUYIN_DOMAIN = "https://www.douyin.com"
# 抖音短域名 (Short Domain)
IESDOUYIN_DOMAIN = "https://www.iesdouyin.com"
# 直播域名 (Live Domain)
LIVE_DOMAIN = "https://live.douyin.com"
# 直播域名2 (Live Domain 2)
LIVE_DOMAIN2 = "https://webcast.amemv.com"
# SSO域名 (SSO Domain)
SSO_DOMAIN = "https://sso.douyin.com"
# WSS域名 (WSS Domain)
WEBCAST_WSS_DOMAIN = "wss://webcast5-ws-web-lf.douyin.com"
# 首页Feed (Home Feed)
TAB_FEED = f"{DOUYIN_DOMAIN}/aweme/v1/web/tab/feed/"
# 用户短信息 (User Short Info)
USER_SHORT_INFO = f"{DOUYIN_DOMAIN}/aweme/v1/web/im/user/info/"
# 用户详细信息 (User Detail Info)
USER_DETAIL = f"{DOUYIN_DOMAIN}/aweme/v1/web/user/profile/other/"
# 作品基本 (Post Basic)
BASE_AWEME = f"{DOUYIN_DOMAIN}/aweme/v1/web/aweme/"
# 用户作品 (User Post)
USER_POST = f"{DOUYIN_DOMAIN}/aweme/v1/web/aweme/post/"
# 定位作品 (Post Local)
LOCATE_POST = f"{DOUYIN_DOMAIN}/aweme/v1/web/locate/post/"
# 综合搜索 (General Search)
GENERAL_SEARCH = f"{DOUYIN_DOMAIN}/aweme/v1/web/general/search/single/"
# 视频搜索 (Video Search)
VIDEO_SEARCH = f"{DOUYIN_DOMAIN}/aweme/v1/web/search/item/"
# 用户搜索 (User Search)
USER_SEARCH = f"{DOUYIN_DOMAIN}/aweme/v1/web/discover/search/"
# 直播间搜索 (Live Search)
LIVE_SEARCH = f"{DOUYIN_DOMAIN}/aweme/v1/web/live/search/"
# 作品信息 (Post Detail)
POST_DETAIL = f"{DOUYIN_DOMAIN}/aweme/v1/web/aweme/detail/"
# 单个作品视频弹幕数据 (Post Danmaku)
POST_DANMAKU = f"{DOUYIN_DOMAIN}/aweme/v1/web/danmaku/get_v2/"
# 用户喜欢A (User Like A)
USER_FAVORITE_A = f"{DOUYIN_DOMAIN}/aweme/v1/web/aweme/favorite/"
# 用户喜欢B (User Like B)
USER_FAVORITE_B = f"{IESDOUYIN_DOMAIN}/web/api/v2/aweme/like/"
# 关注用户(User Following)
USER_FOLLOWING = f"{DOUYIN_DOMAIN}/aweme/v1/web/user/following/list/"
# 粉丝用户 (User Follower)
USER_FOLLOWER = f"{DOUYIN_DOMAIN}/aweme/v1/web/user/follower/list/"
# 合集作品
MIX_AWEME = f"{DOUYIN_DOMAIN}/aweme/v1/web/mix/aweme/"
# 用户历史 (User History)
USER_HISTORY = f"{DOUYIN_DOMAIN}/aweme/v1/web/history/read/"
# 用户收藏 (User Collection)
USER_COLLECTION = f"{DOUYIN_DOMAIN}/aweme/v1/web/aweme/listcollection/"
# 用户收藏夹 (User Collects)
USER_COLLECTS = f"{DOUYIN_DOMAIN}/aweme/v1/web/collects/list/"
# 用户收藏夹作品 (User Collects Posts)
USER_COLLECTS_VIDEO = f"{DOUYIN_DOMAIN}/aweme/v1/web/collects/video/list/"
# 用户音乐收藏 (User Music Collection)
USER_MUSIC_COLLECTION = f"{DOUYIN_DOMAIN}/aweme/v1/web/music/listcollection/"
# 首页朋友作品 (Friend Feed)
FRIEND_FEED = f"{DOUYIN_DOMAIN}/aweme/v1/web/familiar/feed/"
# 关注用户作品 (Follow Feed)
FOLLOW_FEED = f"{DOUYIN_DOMAIN}/aweme/v1/web/follow/feed/"
# 相关推荐 (Related Feed)
POST_RELATED = f"{DOUYIN_DOMAIN}/aweme/v1/web/aweme/related/"
# 关注用户列表直播 (Follow User Live)
FOLLOW_USER_LIVE = f"{DOUYIN_DOMAIN}/webcast/web/feed/follow/"
# 直播信息接口 (Live Info)
LIVE_INFO = f"{LIVE_DOMAIN}/webcast/room/web/enter/"
# 直播信息接口2 (Live Info 2)
LIVE_INFO_ROOM_ID = f"{LIVE_DOMAIN2}/webcast/room/reflow/info/"
# 直播间送礼用户排行榜 (Live Gift Rank)
LIVE_GIFT_RANK = f"{LIVE_DOMAIN}/webcast/ranklist/audience/"
# 直播用户信息 (Live User Info)
LIVE_USER_INFO = f"{LIVE_DOMAIN}/webcast/user/me/"
# 推荐搜索词 (Suggest Words)
SUGGEST_WORDS = f"{DOUYIN_DOMAIN}/aweme/v1/web/api/suggest_words/"
# SSO登录 (SSO Login)
SSO_LOGIN_GET_QR = f"{SSO_DOMAIN}/get_qrcode/"
# 登录检查 (Login Check)
SSO_LOGIN_CHECK_QR = f"{SSO_DOMAIN}/check_qrconnect/"
# 登录确认 (Login Confirm)
SSO_LOGIN_CHECK_LOGIN = f"{SSO_DOMAIN}/check_login/"
# 登录重定向 (Login Redirect)
SSO_LOGIN_REDIRECT = f"{DOUYIN_DOMAIN}/login/"
# 登录回调 (Login Callback)
SSO_LOGIN_CALLBACK = f"{DOUYIN_DOMAIN}/passport/sso/login/callback/"
# 作品评论 (Post Comment)
POST_COMMENT = f"{DOUYIN_DOMAIN}/aweme/v1/web/comment/list/"
# 评论回复 (Comment Reply)
POST_COMMENT_REPLY = f"{DOUYIN_DOMAIN}/aweme/v1/web/comment/list/reply/"
# 回复评论 (Reply Comment)
POST_COMMENT_PUBLISH = f"{DOUYIN_DOMAIN}/aweme/v1/web/comment/publish"
# 删除评论 (Delete Comment)
POST_COMMENT_DELETE = f"{DOUYIN_DOMAIN}/aweme/v1/web/comment/delete/"
# 点赞评论 (Like Comment)
POST_COMMENT_DIGG = f"{DOUYIN_DOMAIN}/aweme/v1/web/comment/digg"
# 抖音热榜数据 (Douyin Hot Search)
DOUYIN_HOT_SEARCH = f"{DOUYIN_DOMAIN}/aweme/v1/web/hot/search/list/"
# 抖音视频频道 (Douyin Video Channel)
DOUYIN_VIDEO_CHANNEL = f"{DOUYIN_DOMAIN}/aweme/v1/web/channel/feed/"
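These constants are plain URL strings, so building a request is just a matter of appending query parameters; a minimal sketch (illustrative only; the import path and the aweme_id are assumptions):

from urllib.parse import urlencode

from crawlers.douyin.web.endpoints import DouyinAPIEndpoints  # module path assumed

params = {"aweme_id": "7000000000000000000", "device_platform": "webapp", "aid": "6383"}
url = f"{DouyinAPIEndpoints.POST_DETAIL}?{urlencode(params)}"
# -> https://www.douyin.com/aweme/v1/web/aweme/detail/?aweme_id=...&device_platform=webapp&aid=6383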

View File

@ -0,0 +1,287 @@
from typing import Any, List
from pydantic import BaseModel, Field
from crawlers.douyin.web.utils import TokenManager, VerifyFpManager
# Base Model
class BaseRequestModel(BaseModel):
device_platform: str = "webapp"
aid: str = "6383"
channel: str = "channel_pc_web"
pc_client_type: int = 1
version_code: str = "190500"
version_name: str = "19.5.0"
cookie_enabled: str = "true"
screen_width: int = 1920
screen_height: int = 1080
browser_language: str = "zh-CN"
browser_platform: str = "Win32"
browser_name: str = "Firefox"
browser_version: str = "124.0"
browser_online: str = "true"
engine_name: str = "Gecko"
engine_version: str = "122.0.0.0"
os_name: str = "Windows"
os_version: str = "10"
cpu_core_num: int = 12
device_memory: int = 8
platform: str = "PC"
# downlink: int = 10
# effective_type: str = "4g"
# round_trip_time: int = 100
msToken: str = TokenManager.gen_real_msToken()
class BaseLiveModel(BaseModel):
aid: str = "6383"
app_name: str = "douyin_web"
live_id: int = 1
device_platform: str = "web"
language: str = "zh-CN"
cookie_enabled: str = "true"
screen_width: int = 1920
screen_height: int = 1080
browser_language: str = "zh-CN"
browser_platform: str = "Win32"
browser_name: str = "Edge"
browser_version: str = "119.0.0.0"
enter_source: Any = ""
is_need_double_stream: str = "false"
# msToken: str = TokenManager.gen_real_msToken()
# _signature: str = ''
class BaseLiveModel2(BaseModel):
verifyFp: str = VerifyFpManager.gen_verify_fp()
type_id: str = "0"
live_id: str = "1"
sec_user_id: str = ""
version_code: str = "99.99.99"
app_id: str = "1128"
msToken: str = TokenManager.gen_real_msToken()
class BaseLoginModel(BaseModel):
service: str = "https://www.douyin.com"
need_logo: str = "false"
need_short_url: str = "true"
device_platform: str = "web_app"
aid: str = "6383"
account_sdk_source: str = "sso"
sdk_version: str = "2.2.7-beta.6"
language: str = "zh"
# Model
class UserProfile(BaseRequestModel):
sec_user_id: str
class UserPost(BaseRequestModel):
max_cursor: int
count: int
sec_user_id: str
# 获取单个作品视频弹幕数据
class PostDanmaku(BaseRequestModel):
item_id: str
duration: int
end_time: int
start_time: int = 0
class UserLike(BaseRequestModel):
max_cursor: int
count: int
sec_user_id: str
class UserCollection(BaseRequestModel):
# POST
cursor: int
count: int
class UserCollects(BaseRequestModel):
# GET
cursor: int
count: int
class UserCollectsVideo(BaseRequestModel):
# GET
cursor: int
count: int
collects_id: str
class UserMusicCollection(BaseRequestModel):
# GET
cursor: int
count: int
class UserMix(BaseRequestModel):
cursor: int
count: int
mix_id: str
class FriendFeed(BaseRequestModel):
cursor: int = 0
level: int = 1
aweme_ids: str = ""
room_ids: str = ""
pull_type: int = 0
address_book_access: int = 2
gps_access: int = 2
recent_gids: str = ""
class PostFeed(BaseRequestModel):
count: int = 10
tag_id: str = ""
share_aweme_id: str = ""
live_insert_type: str = ""
refresh_index: int = 1
video_type_select: int = 1
aweme_pc_rec_raw_data: dict = {} # {"is_client":false}
globalwid: str = ""
pull_type: str = ""
min_window: str = ""
free_right: str = ""
ug_source: str = ""
creative_id: str = ""
class FollowFeed(BaseRequestModel):
cursor: int = 0
level: int = 1
count: int = 20
pull_type: str = ""
class PostRelated(BaseRequestModel):
aweme_id: str
count: int = 20
filterGids: str # id,id,id
awemePcRecRawData: dict = {} # {"is_client":false}
sub_channel_id: int = 3
# Seo-Flag: int = 0
class PostDetail(BaseRequestModel):
aweme_id: str
class PostComments(BaseRequestModel):
aweme_id: str
cursor: int = 0
count: int = 20
item_type: int = 0
insert_ids: str = ""
whale_cut_token: str = ""
cut_version: int = 1
rcFT: str = ""
class PostCommentsReply(BaseRequestModel):
item_id: str
comment_id: str
cursor: int = 0
count: int = 20
item_type: int = 0
class PostLocate(BaseRequestModel):
sec_user_id: str
max_cursor: str # last max_cursor
locate_item_id: str = "" # aweme_id
locate_item_cursor: str
locate_query: str = "true"
count: int = 10
publish_video_strategy_type: int = 2
class UserLive(BaseLiveModel):
web_rid: str
room_id_str: str
# 直播间送礼用户排行榜
class LiveRoomRanking(BaseRequestModel):
webcast_sdk_version: int = 2450
room_id: int
# anchor_id: int
# sec_anchor_id: str
rank_type: int = 30
class UserLive2(BaseLiveModel2):
room_id: str
class FollowUserLive(BaseRequestModel):
scene: str = "aweme_pc_follow_top"
class SuggestWord(BaseRequestModel):
query: str = ""
count: int = 8
business_id: str
from_group_id: str
rsp_source: str = ""
penetrate_params: dict = {}
class LoginGetQr(BaseLoginModel):
verifyFp: str = ""
fp: str = ""
# msToken: str = TokenManager.gen_real_msToken()
class LoginCheckQr(BaseLoginModel):
token: str = ""
verifyFp: str = ""
fp: str = ""
# msToken: str = TokenManager.gen_real_msToken()
class UserFollowing(BaseRequestModel):
user_id: str = ""
sec_user_id: str = ""
offset: int = 0 # 相当于cursor
min_time: int = 0
max_time: int = 0
count: int = 20
# source_type = 1: 最近关注 需要指定max_time(s) 3: 最早关注 需要指定min_time(s) 4: 综合排序
source_type: int = 4
gps_access: int = 0
address_book_access: int = 0
is_top: int = 1
class UserFollower(BaseRequestModel):
user_id: str
sec_user_id: str
offset: int = 0 # 相当于cursor 但只对source_type: = 2 有效,其他情况为 0 即可
min_time: int = 0
max_time: int = 0
count: int = 20
# source_type = 1: 最近关注 需要指定max_time(s) 2: 综合关注(意义不明)
source_type: int = 1
gps_access: int = 0
address_book_access: int = 0
is_top: int = 1
# 列表作品
class URL_List(BaseModel):
urls: List[str] = [
"https://test.example.com/xxxxx/",
"https://test.example.com/yyyyy/",
"https://test.example.com/zzzzz/"
]
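# Illustrative usage (a sketch, not part of the original module): every request
# model inherits the shared web-client defaults from BaseRequestModel, so only the
# endpoint-specific fields need to be supplied. The resulting .dict() is what later
# gets signed with X-Bogus. The sec_user_id below is a placeholder value; note that
# instantiating a model requires the msToken default generated at import time.
if __name__ == "__main__":
    demo_params = UserPost(sec_user_id="MS4wLjABAAAA_placeholder", max_cursor=0, count=10)
    print(demo_params.dict())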

View File

@ -0,0 +1,763 @@
# ==============================================================================
# Copyright (C) 2021 Evil0ctal
#
# This file is part of the Douyin_TikTok_Download_API project.
#
# This project is licensed under the Apache License 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
#         __
#        />  フ
#       |  _  _ l
#       ` ミ_x
#      /      | Feed me Stars ⭐
#     /  ヽ   ノ
#     │  | | |
#  / ̄|   | | |
#  | ( ̄ヽ__ヽ_)__)
#  \二つ
# ==============================================================================
#
# Contributor Link:
# - https://github.com/Evil0ctal
# - https://github.com/Johnserf-Seed
#
# ==============================================================================
import re
import json
import time
import httpx
import qrcode
import random
import asyncio
import yaml
from typing import Union
from pathlib import Path
from crawlers.utils.logger import logger
from crawlers.utils.utils import (
gen_random_str,
get_timestamp,
extract_valid_urls,
split_filename,
)
from crawlers.utils.api_exceptions import (
APIError,
APIConnectionError,
APIResponseError,
APIUnavailableError,
APIUnauthorizedError,
APINotFoundError,
)
from crawlers.douyin.web.xbogus import XBogus as XB
from urllib.parse import quote
import os
# 配置文件路径
# Read the configuration file
path = os.path.abspath(os.path.dirname(__file__))
# 读取配置文件
with open(f"{path}/config.yaml", "r", encoding="utf-8") as f:
config = yaml.safe_load(f)
class TokenManager:
douyin_manager = config.get("TokenManager").get("douyin")
token_conf = douyin_manager.get("msToken", None)
ttwid_conf = douyin_manager.get("ttwid", None)
proxies_conf = douyin_manager.get("proxies", None)
proxies = {
"http://": proxies_conf.get("http", None),
"https://": proxies_conf.get("https", None),
}
@classmethod
def gen_real_msToken(cls) -> str:
"""
生成真实的msToken,当出现错误时返回虚假的值
(Generate a real msToken and return a false value when an error occurs)
"""
payload = json.dumps(
{
"magic": cls.token_conf["magic"],
"version": cls.token_conf["version"],
"dataType": cls.token_conf["dataType"],
"strData": cls.token_conf["strData"],
"tspFromClient": get_timestamp(),
}
)
headers = {
"User-Agent": cls.token_conf["User-Agent"],
"Content-Type": "application/json",
}
transport = httpx.HTTPTransport(retries=5)
with httpx.Client(transport=transport, proxies=cls.proxies) as client:
try:
response = client.post(
cls.token_conf["url"], content=payload, headers=headers
)
response.raise_for_status()
msToken = str(httpx.Cookies(response.cookies).get("msToken"))
if len(msToken) not in [120, 128]:
raise APIResponseError("{0} 内容不符合要求".format("msToken"))
return msToken
except httpx.RequestError as exc:
# 捕获所有与 httpx 请求相关的异常情况 (Captures all httpx request-related exceptions)
raise APIConnectionError(
"请求端点失败,请检查当前网络环境。 链接:{0},代理:{1},异常类名:{2},异常详细信息:{3}"
.format(cls.token_conf["url"], cls.proxies, cls.__name__, exc)
)
except httpx.HTTPStatusError as e:
# 捕获 httpx 的状态代码错误 (captures specific status code errors from httpx)
if e.response.status_code == 401:
raise APIUnauthorizedError(
"参数验证失败,请更新 F2 配置文件中的 {0},以匹配 {1} 新规则"
.format("msToken", "douyin")
)
elif e.response.status_code == 404:
raise APINotFoundError("{0} 无法找到API端点".format("msToken"))
else:
raise APIResponseError(
"链接:{0},状态码 {1}{2} ".format(
e.response.url, e.response.status_code, e.response.text
)
)
except APIError as e:
# 返回虚假的msToken (Return a fake msToken)
logger.error("msToken API错误{0}".format(e))
logger.info("生成虚假的msToken")
return cls.gen_false_msToken()
@classmethod
def gen_false_msToken(cls) -> str:
"""生成随机msToken (Generate random msToken)"""
return gen_random_str(126) + "=="
@classmethod
def gen_ttwid(cls) -> str:
"""
生成请求必带的ttwid
(Generate the essential ttwid for requests)
"""
transport = httpx.HTTPTransport(retries=5)
with httpx.Client(transport=transport) as client:
try:
response = client.post(
cls.ttwid_conf["url"], content=cls.ttwid_conf["data"]
)
response.raise_for_status()
ttwid = str(httpx.Cookies(response.cookies).get("ttwid"))
return ttwid
except httpx.RequestError as exc:
# 捕获所有与 httpx 请求相关的异常情况 (Captures all httpx request-related exceptions)
raise APIConnectionError(
"请求端点失败,请检查当前网络环境。 链接:{0},代理:{1},异常类名:{2},异常详细信息:{3}"
.format(cls.ttwid_conf["url"], cls.proxies, cls.__name__, exc)
)
except httpx.HTTPStatusError as e:
# 捕获 httpx 的状态代码错误 (captures specific status code errors from httpx)
if e.response.status_code == 401:
raise APIUnauthorizedError(
"参数验证失败,请更新 F2 配置文件中的 {0},以匹配 {1} 新规则"
.format("ttwid", "douyin")
)
elif e.response.status_code == 404:
raise APINotFoundError("ttwid无法找到API端点")
else:
raise APIResponseError("链接:{0},状态码 {1}{2} ".format(
e.response.url, e.response.status_code, e.response.text
)
)
class VerifyFpManager:
@classmethod
def gen_verify_fp(cls) -> str:
"""
生成verifyFp s_v_web_id (Generate verifyFp)
"""
base_str = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
t = len(base_str)
milliseconds = int(round(time.time() * 1000))
base36 = ""
while milliseconds > 0:
remainder = milliseconds % 36
if remainder < 10:
base36 = str(remainder) + base36
else:
base36 = chr(ord("a") + remainder - 10) + base36
milliseconds = int(milliseconds / 36)
r = base36
o = [""] * 36
o[8] = o[13] = o[18] = o[23] = "_"
o[14] = "4"
for i in range(36):
if not o[i]:
n = int(random.random() * t)
if i == 19:
n = 3 & n | 8
o[i] = base_str[n]
return "verify_" + r + "_" + "".join(o)
@classmethod
def gen_s_v_web_id(cls) -> str:
return cls.gen_verify_fp()
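# Illustrative check (a sketch, not part of the original file): gen_verify_fp()
# produces values of the form "verify_<base36 millisecond timestamp>_<36-char id>",
# mirroring the front-end s_v_web_id cookie format.
def _example_verify_fp() -> str:
    fp = VerifyFpManager.gen_verify_fp()
    prefix, timestamp_b36, random_part = fp.split("_", 2)
    assert prefix == "verify" and len(random_part) == 36
    return fp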
class BogusManager:
@classmethod
def xb_str_2_endpoint(cls, endpoint: str, user_agent: str) -> str:
try:
final_endpoint = XB(user_agent).getXBogus(endpoint)
except Exception as e:
raise RuntimeError("生成X-Bogus失败: {0})".format(e))
return final_endpoint[0]
@classmethod
def xb_model_2_endpoint(cls, base_endpoint: str, params: dict, user_agent: str) -> str:
if not isinstance(params, dict):
raise TypeError("参数必须是字典类型")
param_str = "&".join([f"{k}={v}" for k, v in params.items()])
try:
xb_value = XB(user_agent).getXBogus(param_str)
except Exception as e:
raise RuntimeError("生成X-Bogus失败: {0})".format(e))
# 检查base_endpoint是否已有查询参数 (Check if base_endpoint already has query parameters)
separator = "&" if "?" in base_endpoint else "?"
final_endpoint = f"{base_endpoint}{separator}{param_str}&X-Bogus={xb_value[1]}"
return final_endpoint
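# Illustrative usage (a sketch, not part of the original file): signing a plain
# parameter dict for a GET endpoint. The endpoint URL, parameters and User-Agent
# below are placeholders; real calls pass a request model's .dict() and the
# endpoint constants from DouyinAPIEndpoints.
def _example_signed_endpoint() -> str:
    demo_params = {"aweme_id": "7345492945006595379", "device_platform": "webapp", "aid": "6383"}
    demo_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36"
    # Returns "<endpoint>?<params>&X-Bogus=<signature>"
    return BogusManager.xb_model_2_endpoint(
        "https://www.douyin.com/aweme/v1/web/aweme/detail/", demo_params, demo_ua
    )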
class SecUserIdFetcher:
# 预编译正则表达式
_DOUYIN_URL_PATTERN = re.compile(r"user/([^/?]*)")
_REDIRECT_URL_PATTERN = re.compile(r"sec_uid=([^&]*)")
@classmethod
async def get_sec_user_id(cls, url: str) -> str:
"""
从单个url中获取sec_user_id (Get sec_user_id from a single url)
Args:
url (str): 输入的url (Input url)
Returns:
str: 匹配到的sec_user_id (Matched sec_user_id)
"""
if not isinstance(url, str):
raise TypeError("参数必须是字符串类型")
# 提取有效URL
url = extract_valid_urls(url)
if url is None:
raise (
APINotFoundError("输入的URL不合法。类名{0}".format(cls.__name__))
)
pattern = (
cls._REDIRECT_URL_PATTERN
if "v.douyin.com" in url
else cls._DOUYIN_URL_PATTERN
)
try:
transport = httpx.AsyncHTTPTransport(retries=5)
async with httpx.AsyncClient(
transport=transport, proxies=TokenManager.proxies, timeout=10
) as client:
response = await client.get(url, follow_redirects=True)
# 444一般为Nginx拦截不返回状态 (444 is generally intercepted by Nginx and does not return status)
if response.status_code in {200, 444}:
match = pattern.search(str(response.url))
if match:
return match.group(1)
else:
raise APIResponseError(
"未在响应的地址中找到sec_user_id检查链接是否为用户主页类名{0}"
.format(cls.__name__)
)
elif response.status_code == 401:
raise APIUnauthorizedError("未授权的请求。类名:{0}".format(cls.__name__)
)
elif response.status_code == 404:
raise APINotFoundError("未找到API端点。类名{0}".format(cls.__name__)
)
elif response.status_code == 503:
raise APIUnavailableError("API服务不可用。类名{0}".format(cls.__name__)
)
else:
raise APIResponseError("链接:{0},状态码 {1}{2} ".format(
response.url, response.status_code, response.text
)
)
except httpx.RequestError as exc:
raise APIConnectionError("请求端点失败,请检查当前网络环境。 链接:{0},代理:{1},异常类名:{2},异常详细信息:{3}"
.format(url, TokenManager.proxies, cls.__name__, exc)
)
@classmethod
async def get_all_sec_user_id(cls, urls: list) -> list:
"""
获取列表sec_user_id列表 (Get list sec_user_id list)
Args:
urls: list: 用户url列表 (User url list)
Return:
sec_user_ids: list: 用户sec_user_id列表 (User sec_user_id list)
"""
if not isinstance(urls, list):
raise TypeError("参数必须是列表类型")
# 提取有效URL
urls = extract_valid_urls(urls)
if urls == []:
raise (
APINotFoundError("输入的URL List不合法。类名{0}".format(cls.__name__)
)
)
sec_user_ids = [cls.get_sec_user_id(url) for url in urls]
return await asyncio.gather(*sec_user_ids)
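# Illustrative usage (a sketch, not part of the original file): resolving a profile
# link to its sec_user_id requires network access and the configured proxies, so it
# must run inside an event loop. The URL below also appears in the crawler's test code.
async def _example_resolve_sec_user_id() -> str:
    profile_url = "https://www.douyin.com/user/MS4wLjABAAAANXSltcLCzDGmdNFI2Q_QixVTr67NiYzjKOIP5s03CAE"
    return await SecUserIdFetcher.get_sec_user_id(profile_url)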
class AwemeIdFetcher:
# 预编译正则表达式
_DOUYIN_VIDEO_URL_PATTERN = re.compile(r"video/([^/?]*)")
_DOUYIN_NOTE_URL_PATTERN = re.compile(r"note/([^/?]*)")
_DOUYIN_DISCOVER_URL_PATTERN = re.compile(r"modal_id=([0-9]+)")
@classmethod
async def get_aweme_id(cls, url: str) -> str:
"""
从单个url中获取aweme_id (Get aweme_id from a single url)
Args:
url (str): 输入的url (Input url)
Returns:
str: 匹配到的aweme_id (Matched aweme_id)
"""
if not isinstance(url, str):
raise TypeError("参数必须是字符串类型")
# 提取有效URL
url = extract_valid_urls(url)
if url is None:
raise (
APINotFoundError("输入的URL不合法。类名{0}".format(cls.__name__))
)
# 重定向到完整链接
transport = httpx.AsyncHTTPTransport(retries=5)
async with httpx.AsyncClient(
transport=transport, proxies=TokenManager.proxies, timeout=10
) as client:
try:
response = await client.get(url, follow_redirects=True)
response.raise_for_status()
video_pattern = cls._DOUYIN_VIDEO_URL_PATTERN
note_pattern = cls._DOUYIN_NOTE_URL_PATTERN
discover_pattern = cls._DOUYIN_DISCOVER_URL_PATTERN
# 2024-4-22
# 嵌套如果超过3层需要修改此处代码 (If the nesting exceeds 3 layers, you need to modify this code)
match = video_pattern.search(str(response.url))
if match:
aweme_id = match.group(1)
else:
match = note_pattern.search(str(response.url))
if match:
aweme_id = match.group(1)
else:
match = discover_pattern.search(str(response.url))
if match:
aweme_id = match.group(1)
else:
raise APIResponseError(
"未在响应的地址中找到aweme_id检查链接是否为作品页"
)
return aweme_id
except httpx.RequestError as exc:
# 捕获所有与 httpx 请求相关的异常情况 (Captures all httpx request-related exceptions)
raise APIConnectionError("请求端点失败,请检查当前网络环境。 链接:{0},代理:{1},异常类名:{2},异常详细信息:{3}"
.format(url, TokenManager.proxies, cls.__name__, exc)
)
except httpx.HTTPStatusError as e:
raise APIResponseError("链接:{0},状态码 {1}{2} ".format(
e.response.url, e.response.status_code, e.response.text
)
)
@classmethod
async def get_all_aweme_id(cls, urls: list) -> list:
"""
获取视频aweme_id,传入列表url都可以解析出aweme_id (Get video aweme_id, pass in the list url can parse out aweme_id)
Args:
urls: list: 列表url (list url)
Return:
aweme_ids: list: 视频的唯一标识返回列表 (The unique identifier of the video, return list)
"""
if not isinstance(urls, list):
raise TypeError("参数必须是列表类型")
# 提取有效URL
urls = extract_valid_urls(urls)
if urls == []:
raise (
APINotFoundError("输入的URL List不合法。类名{0}".format(cls.__name__)
)
)
aweme_ids = [cls.get_aweme_id(url) for url in urls]
return await asyncio.gather(*aweme_ids)
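# Illustrative usage (a sketch, not part of the original file): short share links
# are followed through their redirects before the aweme_id is extracted, so a batch
# of mixed long/short URLs can be resolved concurrently via asyncio.gather().
async def _example_resolve_aweme_ids() -> list:
    demo_urls = [
        "https://www.douyin.com/video/7298145681699622182",
        "https://v.douyin.com/iRNBho6u/",
    ]
    return await AwemeIdFetcher.get_all_aweme_id(demo_urls)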
class MixIdFetcher:
# 获取方法同AwemeIdFetcher
@classmethod
async def get_mix_id(cls, url: str) -> str:
return
class WebCastIdFetcher:
# 预编译正则表达式
_DOUYIN_LIVE_URL_PATTERN = re.compile(r"live/([^/?]*)")
# https://live.douyin.com/766545142636?cover_type=0&enter_from_merge=web_live&enter_method=web_card&game_name=&is_recommend=1&live_type=game&more_detail=&request_id=20231110224012D47CD00C18B4AE4BFF9B&room_id=7299828646049827596&stream_type=vertical&title_type=1&web_live_page=hot_live&web_live_tab=all
# https://live.douyin.com/766545142636
_DOUYIN_LIVE_URL_PATTERN2 = re.compile(r"http[s]?://live.douyin.com/(\d+)")
# https://webcast.amemv.com/douyin/webcast/reflow/7318296342189919011?u_code=l1j9bkbd&did=MS4wLjABAAAAEs86TBQPNwAo-RGrcxWyCdwKhI66AK3Pqf3ieo6HaxI&iid=MS4wLjABAAAA0ptpM-zzoliLEeyvWOCUt-_dQza4uSjlIvbtIazXnCY&with_sec_did=1&use_link_command=1&ecom_share_track_params=&extra_params={"from_request_id":"20231230162057EC005772A8EAA0199906","im_channel_invite_id":"0"}&user_id=3644207898042206&liveId=7318296342189919011&from=share&style=share&enter_method=click_share&roomId=7318296342189919011&activity_info={}
_DOUYIN_LIVE_URL_PATTERN3 = re.compile(r"reflow/([^/?]*)")
@classmethod
async def get_webcast_id(cls, url: str) -> str:
"""
从单个url中获取webcast_id (Get webcast_id from a single url)
Args:
url (str): 输入的url (Input url)
Returns:
str: 匹配到的webcast_id (Matched webcast_id)
"""
if not isinstance(url, str):
raise TypeError("参数必须是字符串类型")
# 提取有效URL
url = extract_valid_urls(url)
if url is None:
raise (
APINotFoundError("输入的URL不合法。类名{0}".format(cls.__name__))
)
try:
# 重定向到完整链接
transport = httpx.AsyncHTTPTransport(retries=5)
async with httpx.AsyncClient(
transport=transport, proxies=TokenManager.proxies, timeout=10
) as client:
response = await client.get(url, follow_redirects=True)
response.raise_for_status()
url = str(response.url)
live_pattern = cls._DOUYIN_LIVE_URL_PATTERN
live_pattern2 = cls._DOUYIN_LIVE_URL_PATTERN2
live_pattern3 = cls._DOUYIN_LIVE_URL_PATTERN3
if live_pattern.search(url):
match = live_pattern.search(url)
elif live_pattern2.search(url):
match = live_pattern2.search(url)
elif live_pattern3.search(url):
match = live_pattern3.search(url)
logger.warning("该链接返回的是room_id请使用`fetch_user_live_videos_by_room_id`接口"
)
else:
raise APIResponseError("未在响应的地址中找到webcast_id检查链接是否为直播页"
)
return match.group(1)
except httpx.RequestError as exc:
# 捕获所有与 httpx 请求相关的异常情况 (Captures all httpx request-related exceptions)
raise APIConnectionError("请求端点失败,请检查当前网络环境。 链接:{0},代理:{1},异常类名:{2},异常详细信息:{3}"
.format(url, TokenManager.proxies, cls.__name__, exc)
)
except httpx.HTTPStatusError as e:
raise APIResponseError("链接:{0},状态码 {1}{2} ".format(
e.response.url, e.response.status_code, e.response.text
)
)
@classmethod
async def get_all_webcast_id(cls, urls: list) -> list:
"""
获取直播webcast_id,传入列表url都可以解析出webcast_id (Get live webcast_id, pass in the list url can parse out webcast_id)
Args:
urls: list: 列表url (list url)
Return:
webcast_ids: list: 直播的唯一标识返回列表 (The unique identifier of the live, return list)
"""
if not isinstance(urls, list):
raise TypeError("参数必须是列表类型")
# 提取有效URL
urls = extract_valid_urls(urls)
if urls == []:
raise (
APINotFoundError("输入的URL List不合法。类名{0}".format(cls.__name__)
)
)
webcast_ids = [cls.get_webcast_id(url) for url in urls]
return await asyncio.gather(*webcast_ids)
def format_file_name(
naming_template: str,
aweme_data: dict = {},
custom_fields: dict = {},
) -> str:
"""
根据配置文件的全局格式化文件名
(Format file name according to the global conf file)
Args:
aweme_data (dict): 抖音数据的字典 (dict of douyin data)
naming_template (str): 文件的命名模板, "{create}_{desc}" (Naming template for files, such as "{create}_{desc}")
custom_fields (dict): 用户自定义字段, 用于替代默认的字段值 (Custom fields for replacing default field values)
Note:
windows 文件名长度限制为 255 个字符, 开启了长文件名支持后为 32,767 个字符
(Windows file name length limit is 255 characters, 32,767 characters after long file name support is enabled)
Unix 文件名长度限制为 255 个字符
(Unix file name length limit is 255 characters)
取去除后的50个字符, 加上后缀, 一般不会超过255个字符
(Take the removed 50 characters, add the suffix, and generally not exceed 255 characters)
详细信息请参考: https://en.wikipedia.org/wiki/Filename#Length
(For more information, please refer to: https://en.wikipedia.org/wiki/Filename#Length)
Returns:
str: 格式化的文件名 (Formatted file name)
"""
# 为不同系统设置不同的文件名长度限制
os_limit = {
"win32": 200,
"cygwin": 60,
"darwin": 60,
"linux": 60,
}
fields = {
"create": aweme_data.get("create_time", ""), # 长度固定19
"nickname": aweme_data.get("nickname", ""), # 最长30
"aweme_id": aweme_data.get("aweme_id", ""), # 长度固定19
"desc": split_filename(aweme_data.get("desc", ""), os_limit),
"uid": aweme_data.get("uid", ""), # 固定11
}
if custom_fields:
# 更新自定义字段
fields.update(custom_fields)
try:
return naming_template.format(**fields)
except KeyError as e:
raise KeyError("文件名模板字段 {0} 不存在,请检查".format(e))
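# Illustrative usage (a sketch, not part of the original file): the template keys map
# to the fields dict built above; the values here are placeholders, not real post data.
def _example_format_file_name() -> str:
    demo_aweme = {
        "create_time": "2024-04-22 21.02.42",
        "nickname": "demo_author",
        "aweme_id": "7345492945006595379",
        "desc": "a short description",
        "uid": "12345678901",
    }
    # -> "2024-04-22 21.02.42_a short description" (desc truncated per the OS limit)
    return format_file_name("{create}_{desc}", aweme_data=demo_aweme)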
def create_user_folder(kwargs: dict, nickname: Union[str, int]) -> Path:
"""
根据提供的配置文件和昵称创建对应的保存目录
(Create the corresponding save directory according to the provided conf file and nickname.)
Args:
kwargs (dict): 配置文件字典格式(Conf file, dict format)
nickname (Union[str, int]): 用户的昵称允许字符串或整数 (User nickname, allow strings or integers)
Note:
如果未在配置文件中指定路径则默认为 "Download"
(If the path is not specified in the conf file, it defaults to "Download".)
支持绝对与相对路径
(Support absolute and relative paths)
Raises:
TypeError: 如果 kwargs 不是字典格式将引发 TypeError
(If kwargs is not in dict format, TypeError will be raised.)
"""
# 确定函数参数是否正确
if not isinstance(kwargs, dict):
raise TypeError("kwargs 参数必须是字典")
# 创建基础路径
base_path = Path(kwargs.get("path", "Download"))
# 添加下载模式和用户名
user_path = (
base_path / "douyin" / kwargs.get("mode", "PLEASE_SETUP_MODE") / str(nickname)
)
# 获取绝对路径并确保它存在
resolve_user_path = user_path.resolve()
# 创建目录
resolve_user_path.mkdir(parents=True, exist_ok=True)
return resolve_user_path
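# Illustrative usage (a sketch, not part of the original file): with the values below,
# the folder Download/douyin/post/demo_user is created relative to the working
# directory and returned as an absolute Path. "post" is a placeholder mode name.
def _example_create_user_folder() -> Path:
    return create_user_folder({"path": "Download", "mode": "post"}, "demo_user")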
def rename_user_folder(old_path: Path, new_nickname: str) -> Path:
"""
重命名用户目录 (Rename User Folder).
Args:
old_path (Path): 旧的用户目录路径 (Path of the old user folder)
new_nickname (str): 新的用户昵称 (New user nickname)
Returns:
Path: 重命名后的用户目录路径 (Path of the renamed user folder)
"""
# 获取目标目录的父目录 (Get the parent directory of the target folder)
parent_directory = old_path.parent
# 构建新目录路径 (Construct the new directory path)
new_path = old_path.rename(parent_directory / new_nickname).resolve()
return new_path
def create_or_rename_user_folder(
kwargs: dict, local_user_data: dict, current_nickname: str
) -> Path:
"""
创建或重命名用户目录 (Create or rename user directory)
Args:
kwargs (dict): 配置参数 (Conf parameters)
local_user_data (dict): 本地用户数据 (Local user data)
current_nickname (str): 当前用户昵称 (Current user nickname)
Returns:
user_path (Path): 用户目录路径 (User directory path)
"""
user_path = create_user_folder(kwargs, current_nickname)
if not local_user_data:
return user_path
if local_user_data.get("nickname") != current_nickname:
# 昵称不一致,触发目录更新操作
user_path = rename_user_folder(user_path, current_nickname)
return user_path
def show_qrcode(qrcode_url: str, show_image: bool = False) -> None:
"""
显示二维码 (Show QR code)
Args:
qrcode_url (str): 登录二维码链接 (Login QR code link)
show_image (bool): 是否显示图像True 表示显示False 表示在控制台显示
(Whether to display the image, True means display, False means display in the console)
"""
if show_image:
# 创建并显示QR码图像
qr_code_img = qrcode.make(qrcode_url)
qr_code_img.show()
else:
# 在控制台以 ASCII 形式打印二维码
qr = qrcode.QRCode()
qr.add_data(qrcode_url)
qr.make(fit=True)
# 在控制台以 ASCII 形式打印二维码
qr.print_ascii(invert=True)
def json_2_lrc(data: Union[str, list, dict]) -> str:
"""
从抖音原声json格式歌词生成lrc格式歌词
(Generate lrc lyrics format from Douyin original json lyrics format)
Args:
data (Union[str, list, dict]): 抖音原声json格式歌词 (Douyin original json lyrics format)
Returns:
str: 生成的lrc格式歌词 (Generated lrc format lyrics)
"""
try:
lrc_lines = []
for item in data:
text = item["text"]
time_seconds = float(item["timeId"])
minutes = int(time_seconds // 60)
seconds = int(time_seconds % 60)
milliseconds = int((time_seconds % 1) * 1000)
time_str = f"{minutes:02}:{seconds:02}.{milliseconds:03}"
lrc_lines.append(f"[{time_str}] {text}")
except KeyError as e:
raise KeyError("歌词数据字段错误:{0}".format(e))
except RuntimeError as e:
raise RuntimeError("生成歌词文件失败:{0},请检查歌词 `data` 内容".format(e))
except TypeError as e:
raise TypeError("歌词数据类型错误:{0}".format(e))
return "\n".join(lrc_lines)

View File

@ -0,0 +1,486 @@
# ==============================================================================
# Copyright (C) 2021 Evil0ctal
#
# This file is part of the Douyin_TikTok_Download_API project.
#
# This project is licensed under the Apache License 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
#         __
#        />  フ
#       |  _  _ l
#       ` ミ_x
#      /      | Feed me Stars ⭐
#     /  ヽ   ノ
#     │  | | |
#  / ̄|   | | |
#  | ( ̄ヽ__ヽ_)__)
#  \二つ
# ==============================================================================
#
# Contributor Link:
# - https://github.com/Evil0ctal
# - https://github.com/Johnserf-Seed
#
# ==============================================================================
import asyncio # 异步I/O
import time # 时间操作
import yaml # 配置文件
import os # 系统操作
# 基础爬虫客户端和抖音API端点
from crawlers.base_crawler import BaseCrawler
from crawlers.douyin.web.endpoints import DouyinAPIEndpoints
# 抖音应用的工具类
from crawlers.douyin.web.utils import (AwemeIdFetcher, # Aweme ID获取
BogusManager, # XBogus管理
SecUserIdFetcher, # 安全用户ID获取
TokenManager, # 令牌管理
VerifyFpManager, # 验证管理
WebCastIdFetcher, # 直播ID获取
extract_valid_urls # URL提取
)
# 抖音接口数据请求模型
from crawlers.douyin.web.models import (
BaseRequestModel, LiveRoomRanking, PostComments,
PostCommentsReply, PostDanmaku, PostDetail,
UserProfile, UserCollection, UserLike, UserLive,
UserLive2, UserMix, UserPost
)
# 配置文件路径
path = os.path.abspath(os.path.dirname(__file__))
# 读取配置文件
with open(f"{path}/config.yaml", "r", encoding="utf-8") as f:
config = yaml.safe_load(f)
class DouyinWebCrawler:
# 从配置文件中获取抖音的请求头
async def get_douyin_headers(self):
douyin_config = config["TokenManager"]["douyin"]
kwargs = {
"headers": {
"Accept-Language": douyin_config["headers"]["Accept-Language"],
"User-Agent": douyin_config["headers"]["User-Agent"],
"Referer": douyin_config["headers"]["Referer"],
"Cookie": douyin_config["headers"]["Cookie"],
},
"proxies": {"http://": douyin_config["proxies"]["http"], "https://": douyin_config["proxies"]["https"]},
}
return kwargs
"-------------------------------------------------------handler接口列表-------------------------------------------------------"
# 获取单个作品数据
async def fetch_one_video(self, aweme_id: str):
# 获取抖音的实时Cookie
kwargs = await self.get_douyin_headers()
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
# 创建一个作品详情的BaseModel参数
params = PostDetail(aweme_id=aweme_id)
# 生成一个作品详情的带有加密参数的Endpoint
endpoint = BogusManager.xb_model_2_endpoint(
DouyinAPIEndpoints.POST_DETAIL, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取用户发布作品数据
async def fetch_user_post_videos(self, sec_user_id: str, max_cursor: int, count: int):
kwargs = await self.get_douyin_headers()
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
params = UserPost(sec_user_id=sec_user_id, max_cursor=max_cursor, count=count)
endpoint = BogusManager.xb_model_2_endpoint(
DouyinAPIEndpoints.USER_POST, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取用户喜欢作品数据
async def fetch_user_like_videos(self, sec_user_id: str, max_cursor: int, count: int):
kwargs = await self.get_douyin_headers()
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
params = UserLike(sec_user_id=sec_user_id, max_cursor=max_cursor, count=count)
endpoint = BogusManager.xb_model_2_endpoint(
DouyinAPIEndpoints.USER_FAVORITE_A, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取用户收藏作品数据用户提供自己的Cookie
async def fetch_user_collection_videos(self, cookie: str, cursor: int = 0, count: int = 20):
kwargs = await self.get_douyin_headers()
kwargs["headers"]["Cookie"] = cookie
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
params = UserCollection(cursor=cursor, count=count)
endpoint = BogusManager.xb_model_2_endpoint(
DouyinAPIEndpoints.USER_COLLECTION, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_post_json(endpoint)
return response
# 获取用户合辑作品数据
async def fetch_user_mix_videos(self, mix_id: str, cursor: int = 0, count: int = 20):
kwargs = await self.get_douyin_headers()
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
params = UserMix(mix_id=mix_id, cursor=cursor, count=count)
endpoint = BogusManager.xb_model_2_endpoint(
DouyinAPIEndpoints.MIX_AWEME, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取用户直播流数据
async def fetch_user_live_videos(self, webcast_id: str, room_id_str=""):
kwargs = await self.get_douyin_headers()
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
params = UserLive(web_rid=webcast_id, room_id_str=room_id_str)
endpoint = BogusManager.xb_model_2_endpoint(
DouyinAPIEndpoints.LIVE_INFO, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取指定用户的直播流数据
async def fetch_user_live_videos_by_room_id(self, room_id: str):
kwargs = await self.get_douyin_headers()
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
params = UserLive2(room_id=room_id)
endpoint = BogusManager.xb_model_2_endpoint(
DouyinAPIEndpoints.LIVE_INFO_ROOM_ID, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取直播间送礼用户排行榜
async def fetch_live_gift_ranking(self, room_id: str, rank_type: int = 30):
kwargs = await self.get_douyin_headers()
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
params = LiveRoomRanking(room_id=room_id, rank_type=rank_type)
endpoint = BogusManager.xb_model_2_endpoint(
DouyinAPIEndpoints.LIVE_GIFT_RANK, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取指定用户的信息
async def handler_user_profile(self, sec_user_id: str):
kwargs = await self.get_douyin_headers()
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
params = UserProfile(sec_user_id=sec_user_id)
endpoint = BogusManager.xb_model_2_endpoint(
DouyinAPIEndpoints.USER_DETAIL, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取指定视频的评论数据
async def fetch_video_comments(self, aweme_id: str, cursor: int = 0, count: int = 20):
kwargs = await self.get_douyin_headers()
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
params = PostComments(aweme_id=aweme_id, cursor=cursor, count=count)
endpoint = BogusManager.xb_model_2_endpoint(
DouyinAPIEndpoints.POST_COMMENT, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取指定视频的评论回复数据
async def fetch_video_comments_reply(self, item_id: str, comment_id: str, cursor: int = 0, count: int = 20):
kwargs = await self.get_douyin_headers()
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
params = PostCommentsReply(item_id=item_id, comment_id=comment_id, cursor=cursor, count=count)
endpoint = BogusManager.xb_model_2_endpoint(
DouyinAPIEndpoints.POST_COMMENT_REPLY, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取抖音热榜数据
async def fetch_hot_search_result(self):
kwargs = await self.get_douyin_headers()
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
params = BaseRequestModel()
endpoint = BogusManager.xb_model_2_endpoint(
DouyinAPIEndpoints.DOUYIN_HOT_SEARCH, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
"-------------------------------------------------------utils接口列表-------------------------------------------------------"
# 生成真实msToken
async def gen_real_msToken(self, ):
result = {
"msToken": TokenManager().gen_real_msToken()
}
return result
# 生成ttwid
async def gen_ttwid(self, ):
result = {
"ttwid": TokenManager().gen_ttwid()
}
return result
# 生成verify_fp
async def gen_verify_fp(self, ):
result = {
"verify_fp": VerifyFpManager.gen_verify_fp()
}
return result
# 生成s_v_web_id
async def gen_s_v_web_id(self, ):
result = {
"s_v_web_id": VerifyFpManager.gen_s_v_web_id()
}
return result
# 使用接口地址生成Xb参数
async def get_x_bogus(self, url: str, user_agent: str):
url = BogusManager.xb_str_2_endpoint(url, user_agent)
result = {
"url": url,
"x_bogus": url.split("&X-Bogus=")[1],
"user_agent": user_agent
}
return result
# 提取单个用户id
async def get_sec_user_id(self, url: str):
return await SecUserIdFetcher.get_sec_user_id(url)
# 提取列表用户id
async def get_all_sec_user_id(self, urls: list):
# 提取有效URL
urls = extract_valid_urls(urls)
# 对于URL列表
return await SecUserIdFetcher.get_all_sec_user_id(urls)
# 提取单个作品id
async def get_aweme_id(self, url: str):
return await AwemeIdFetcher.get_aweme_id(url)
# 提取列表作品id
async def get_all_aweme_id(self, urls: list):
# 提取有效URL
urls = extract_valid_urls(urls)
# 对于URL列表
return await AwemeIdFetcher.get_all_aweme_id(urls)
# 提取单个直播间号
async def get_webcast_id(self, url: str):
return await WebCastIdFetcher.get_webcast_id(url)
# 提取列表直播间号
async def get_all_webcast_id(self, urls: list):
# 提取有效URL
urls = extract_valid_urls(urls)
# 对于URL列表
return await WebCastIdFetcher.get_all_webcast_id(urls)
async def main(self):
"""-------------------------------------------------------handler接口列表-------------------------------------------------------"""
# 获取单一视频信息
# aweme_id = "7345492945006595379"
# result = await self.fetch_one_video(aweme_id)
# print(result)
# 获取用户发布作品数据
# sec_user_id = "MS4wLjABAAAANXSltcLCzDGmdNFI2Q_QixVTr67NiYzjKOIP5s03CAE"
# max_cursor = 0
# count = 10
# result = await self.fetch_user_post_videos(sec_user_id, max_cursor, count)
# print(result)
# 获取用户喜欢作品数据
# 这个账号有点东西,晚点看一下 https://www.douyin.com/user/MS4wLjABAAAAW9FWcqS7RdQAWPd2AA5fL_ilmqsIFUCQ_Iym6Yh9_cUa6ZRqVLjVQSUjlHrfXY1Y
# sec_user_id = "MS4wLjABAAAAW9FWcqS7RdQAWPd2AA5fL_ilmqsIFUCQ_Iym6Yh9_cUa6ZRqVLjVQSUjlHrfXY1Y"
# max_cursor = 0
# count = 10
# result = await self.fetch_user_like_videos(sec_user_id, max_cursor, count)
# print(result)
# 获取用户收藏作品数据用户提供自己的Cookie
# cookie = "带上你的Cookie/Put your Cookie here"
# cursor = 0
# counts = 20
# result = await self.fetch_user_collection_videos(cookie, cursor, counts)
# print(result)
# 获取用户合辑作品数据
# https://www.douyin.com/collection/7348687990509553679
# mix_id = "7348687990509553679"
# cursor = 0
# counts = 20
# result = await self.fetch_user_mix_videos(mix_id, cursor, counts)
# print(result)
# 获取用户直播流数据
# https://live.douyin.com/285520721194
# webcast_id = "285520721194"
# result = await self.fetch_user_live_videos(webcast_id)
# print(result)
# 获取指定用户的直播流数据
# # https://live.douyin.com/7318296342189919011
# room_id = "7318296342189919011"
# result = await self.fetch_user_live_videos_by_room_id(room_id)
# print(result)
# 获取直播间送礼用户排行榜
# room_id = "7356585666190461731"
# rank_type = 30
# result = await self.fetch_live_gift_ranking(room_id, rank_type)
# print(result)
# 获取指定用户的信息
# sec_user_id = "MS4wLjABAAAAW9FWcqS7RdQAWPd2AA5fL_ilmqsIFUCQ_Iym6Yh9_cUa6ZRqVLjVQSUjlHrfXY1Y"
# result = await self.handler_user_profile(sec_user_id)
# print(result)
# 获取单个视频评论数据
# aweme_id = "7334525738793618688"
# result = await self.fetch_video_comments(aweme_id)
# print(result)
# 获取单个视频评论回复数据
# item_id = "7344709764531686690"
# comment_id = "7346856757471953698"
# result = await self.fetch_video_comments_reply(item_id, comment_id)
# print(result)
# 获取指定关键词的综合搜索结果
# keyword = "中华娘"
# offset = 0
# count = 20
# sort_type = "0"
# publish_time = "0"
# filter_duration = "0"
# result = await self.fetch_general_search_result(keyword, offset, count, sort_type, publish_time, filter_duration)
# print(result)
# 获取抖音热榜数据
# result = await self.fetch_hot_search_result()
# print(result)
"""-------------------------------------------------------utils接口列表-------------------------------------------------------"""
# 生成真实msToken
# result = await self.gen_real_msToken()
# print(result)
# 生成ttwid
# result = await self.gen_ttwid()
# print(result)
# 生成verify_fp
# result = await self.gen_verify_fp()
# print(result)
# 生成s_v_web_id
# result = await self.gen_s_v_web_id()
# print(result)
# 使用接口地址生成Xb参数
# url = "https://www.douyin.com/aweme/v1/web/comment/list/?device_platform=webapp&aid=6383&channel=channel_pc_web&aweme_id=7334525738793618688&cursor=0&count=20&item_type=0&insert_ids=&whale_cut_token=&cut_version=1&rcFT=&pc_client_type=1&version_code=170400&version_name=17.4.0&cookie_enabled=true&screen_width=1344&screen_height=756&browser_language=zh-CN&browser_platform=Win32&browser_name=Firefox&browser_version=124.0&browser_online=true&engine_name=Gecko&engine_version=124.0&os_name=Windows&os_version=10&cpu_core_num=16&device_memory=&platform=PC&webid=7348962975497324070"
# user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36"
# result = await self.get_x_bogus(url, user_agent)
# print(result)
# 提取单个用户id
# raw_url = "https://www.douyin.com/user/MS4wLjABAAAANXSltcLCzDGmdNFI2Q_QixVTr67NiYzjKOIP5s03CAE?vid=7285950278132616463"
# result = await self.get_sec_user_id(raw_url)
# print(result)
# 提取列表用户id
# raw_urls = [
# "https://www.douyin.com/user/MS4wLjABAAAANXSltcLCzDGmdNFI2Q_QixVTr67NiYzjKOIP5s03CAE?vid=7285950278132616463",
# "https://www.douyin.com/user/MS4wLjABAAAAVsneOf144eGDFf8Xp9QNb1VW6ovXnNT5SqJBhJfe8KQBKWKDTWK5Hh-_i9mJzb8C",
# "长按复制此条消息打开抖音搜索查看TA的更多作品。 https://v.douyin.com/idFqvUms/",
# "https://v.douyin.com/idFqvUms/",
# ]
# result = await self.get_all_sec_user_id(raw_urls)
# print(result)
# 提取单个作品id
# raw_url = "https://www.douyin.com/video/7298145681699622182?previous_page=web_code_link"
# result = await self.get_aweme_id(raw_url)
# print(result)
# 提取列表作品id
# raw_urls = [
# "0.53 02/26 I@v.sE Fus:/ 你别太帅了郑润泽# 现场版live # 音乐节 # 郑润泽 https://v.douyin.com/iRNBho6u/ 复制此链接打开Dou音搜索直接观看视频!",
# "https://v.douyin.com/iRNBho6u/",
# "https://www.iesdouyin.com/share/video/7298145681699622182/?region=CN&mid=7298145762238565171&u_code=l1j9bkbd&did=MS4wLjABAAAAtqpCx0hpOERbdSzQdjRZw-wFPxaqdbAzsKDmbJMUI3KWlMGQHC-n6dXAqa-dM2EP&iid=MS4wLjABAAAANwkJuWIRFOzg5uCpDRpMj4OX-QryoDgn-yYlXQnRwQQ&with_sec_did=1&titleType=title&share_sign=05kGlqGmR4_IwCX.ZGk6xuL0osNA..5ur7b0jbOx6cc-&share_version=170400&ts=1699262937&from_aid=6383&from_ssr=1&from=web_code_link",
# "https://www.douyin.com/video/7298145681699622182?previous_page=web_code_link",
# "https://www.douyin.com/video/7298145681699622182",
# ]
# result = await self.get_all_aweme_id(raw_urls)
# print(result)
# 提取单个直播间号
# raw_url = "https://live.douyin.com/775841227732"
# result = await self.get_webcast_id(raw_url)
# print(result)
# 提取列表直播间号
# raw_urls = [
# "https://live.douyin.com/775841227732",
# "https://live.douyin.com/775841227732?room_id=7318296342189919011&enter_from_merge=web_share_link&enter_method=web_share_link&previous_page=app_code_link",
# 'https://webcast.amemv.com/douyin/webcast/reflow/7318296342189919011?u_code=l1j9bkbd&did=MS4wLjABAAAAEs86TBQPNwAo-RGrcxWyCdwKhI66AK3Pqf3ieo6HaxI&iid=MS4wLjABAAAA0ptpM-zzoliLEeyvWOCUt-_dQza4uSjlIvbtIazXnCY&with_sec_did=1&use_link_command=1&ecom_share_track_params=&extra_params={"from_request_id":"20231230162057EC005772A8EAA0199906","im_channel_invite_id":"0"}&user_id=3644207898042206&liveId=7318296342189919011&from=share&style=share&enter_method=click_share&roomId=7318296342189919011&activity_info={}',
# "6i- Q@x.Sl 03/23 【醒子8ke的直播间】 点击打开👉https://v.douyin.com/i8tBR7hX/ 或长按复制此条消息打开抖音看TA直播",
# "https://v.douyin.com/i8tBR7hX/",
# ]
# result = await self.get_all_webcast_id(raw_urls)
# print(result)
# 占位
pass
if __name__ == "__main__":
# 初始化
douyin_web_crawler = DouyinWebCrawler()
# 开始时间
start = time.time()
asyncio.run(douyin_web_crawler.main())
# 结束时间
end = time.time()
print(f"耗时:{end - start}")

View File

@ -0,0 +1,248 @@
# ==============================================================================
# Copyright (C) 2021 Evil0ctal
#
# This file is part of the Douyin_TikTok_Download_API project.
#
# This project is licensed under the Apache License 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
#         __
#        />  フ
#       |  _  _ l
#       ` ミ_x
#      /      | Feed me Stars ⭐
#     /  ヽ   ノ
#     │  | | |
#  / ̄|   | | |
#  | ( ̄ヽ__ヽ_)__)
#  \二つ
# ==============================================================================
#
# Contributor Link:
# - https://github.com/Evil0ctal
# - https://github.com/Johnserf-Seed
#
# ==============================================================================
import time
import base64
import hashlib
class XBogus:
def __init__(self, user_agent: str = None) -> None:
# fmt: off
self.Array = [
None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None,
None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None,
None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, None, None, None, None, None, None, None, None, None, None, None,
None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None,
None, None, None, None, None, None, None, None, None, None, None, None, 10, 11, 12, 13, 14, 15
]
self.character = "Dkdpgh4ZKsQB80/Mfvw36XI1R25-WUAlEi7NLboqYTOPuzmFjJnryx9HVGcaStCe="
# fmt: on
self.ua_key = b"\x00\x01\x0c"
self.user_agent = (
user_agent
if user_agent is not None and user_agent != ""
else "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36 Edg/122.0.0.0"
)
def md5_str_to_array(self, md5_str):
"""
将字符串使用md5哈希算法转换为整数数组
Convert a string to an array of integers using the md5 hashing algorithm.
"""
if isinstance(md5_str, str) and len(md5_str) > 32:
return [ord(char) for char in md5_str]
else:
array = []
idx = 0
while idx < len(md5_str):
array.append(
(self.Array[ord(md5_str[idx])] << 4)
| self.Array[ord(md5_str[idx + 1])]
)
idx += 2
return array
def md5_encrypt(self, url_path):
"""
使用多轮md5哈希算法对URL路径进行加密
Encrypt the URL path using multiple rounds of md5 hashing.
"""
hashed_url_path = self.md5_str_to_array(
self.md5(self.md5_str_to_array(self.md5(url_path)))
)
return hashed_url_path
def md5(self, input_data):
"""
计算输入数据的md5哈希值
Calculate the md5 hash value of the input data.
"""
if isinstance(input_data, str):
array = self.md5_str_to_array(input_data)
elif isinstance(input_data, list):
array = input_data
else:
raise ValueError("Invalid input type. Expected str or list.")
md5_hash = hashlib.md5()
md5_hash.update(bytes(array))
return md5_hash.hexdigest()
def encoding_conversion(
self, a, b, c, e, d, t, f, r, n, o, i, _, x, u, s, l, v, h, p
):
"""
第一次编码转换
Perform encoding conversion.
"""
y = [a]
y.append(int(i))
y.extend([b, _, c, x, e, u, d, s, t, l, f, v, r, h, n, p, o])
re = bytes(y).decode("ISO-8859-1")
return re
def encoding_conversion2(self, a, b, c):
"""
第二次编码转换
Perform an encoding conversion on the given input values and return the result.
"""
return chr(a) + chr(b) + c
def rc4_encrypt(self, key, data):
"""
使用RC4算法对数据进行加密
Encrypt data using the RC4 algorithm.
"""
S = list(range(256))
j = 0
encrypted_data = bytearray()
# 初始化 S 盒
# Initialize the S box
for i in range(256):
j = (j + S[i] + key[i % len(key)]) % 256
S[i], S[j] = S[j], S[i]
# 生成密文
# Generate the ciphertext
i = j = 0
for byte in data:
i = (i + 1) % 256
j = (j + S[i]) % 256
S[i], S[j] = S[j], S[i]
encrypted_byte = byte ^ S[(S[i] + S[j]) % 256]
encrypted_data.append(encrypted_byte)
return encrypted_data
def calculation(self, a1, a2, a3):
"""
对给定的输入值执行位运算计算并返回结果
Perform a calculation using bitwise operations on the given input values and return the result.
"""
x1 = (a1 & 255) << 16
x2 = (a2 & 255) << 8
x3 = x1 | x2 | a3
return (
self.character[(x3 & 16515072) >> 18]
+ self.character[(x3 & 258048) >> 12]
+ self.character[(x3 & 4032) >> 6]
+ self.character[x3 & 63]
)
def getXBogus(self, url_path):
"""
获取 X-Bogus
Get the X-Bogus value.
"""
array1 = self.md5_str_to_array(
self.md5(
base64.b64encode(
self.rc4_encrypt(self.ua_key, self.user_agent.encode("ISO-8859-1"))
).decode("ISO-8859-1")
)
)
array2 = self.md5_str_to_array(
self.md5(self.md5_str_to_array("d41d8cd98f00b204e9800998ecf8427e"))
)
url_path_array = self.md5_encrypt(url_path)
timer = int(time.time())
ct = 536919696
array3 = []
array4 = []
xb_ = ""
# fmt: off
new_array = [
64, 0.00390625, 1, 12,
url_path_array[14], url_path_array[15], array2[14], array2[15], array1[14], array1[15],
timer >> 24 & 255, timer >> 16 & 255, timer >> 8 & 255, timer & 255,
ct >> 24 & 255, ct >> 16 & 255, ct >> 8 & 255, ct & 255
]
# fmt: on
xor_result = new_array[0]
for i in range(1, len(new_array)):
b = new_array[i]
if isinstance(b, float):
b = int(b)
xor_result ^= b
new_array.append(xor_result)
idx = 0
while idx < len(new_array):
array3.append(new_array[idx])
try:
array4.append(new_array[idx + 1])
except IndexError:
pass
idx += 2
merge_array = array3 + array4
garbled_code = self.encoding_conversion2(
2,
255,
self.rc4_encrypt(
"ÿ".encode("ISO-8859-1"),
self.encoding_conversion(*merge_array).encode("ISO-8859-1"),
).decode("ISO-8859-1"),
)
idx = 0
while idx < len(garbled_code):
xb_ += self.calculation(
ord(garbled_code[idx]),
ord(garbled_code[idx + 1]),
ord(garbled_code[idx + 2]),
)
idx += 3
self.params = "%s&X-Bogus=%s" % (url_path, xb_)
self.xb = xb_
return (self.params, self.xb, self.user_agent)
if __name__ == "__main__":
url_path = "https://www.douyin.com/aweme/v1/web/aweme/post/?device_platform=webapp&aid=6383&channel=channel_pc_web&sec_user_id=MS4wLjABAAAAW9FWcqS7RdQAWPd2AA5fL_ilmqsIFUCQ_Iym6Yh9_cUa6ZRqVLjVQSUjlHrfXY1Y&max_cursor=0&locate_query=false&show_live_replay_strategy=1&need_time_list=1&time_list_query=0&whale_cut_token=&cut_version=1&count=18&publish_video_strategy_type=2&pc_client_type=1&version_code=170400&version_name=17.4.0&cookie_enabled=true&screen_width=1920&screen_height=1080&browser_language=zh-CN&browser_platform=Win32&browser_name=Edge&browser_version=122.0.0.0&browser_online=true&engine_name=Blink&engine_version=122.0.0.0&os_name=Windows&os_version=10&cpu_core_num=12&device_memory=8&platform=PC&downlink=10&effective_type=4g&round_trip_time=50&webid=7335414539335222835&msToken=p9Y7fUBuq9DKvAuN27Peml6JbaMqG2ZcXfFiyDv1jcHrCN00uidYqUgSuLsKl1onC-E_n82m-aKKYE0QGEmxIWZx9iueQ6WLbvzPfqnMk4GBAlQIHcDzxb38FLXXQxAm"
# ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36 Edg/122.0.0.0"
ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36"
XB = XBogus(user_agent=ua)
xbogus = XB.getXBogus(url_path)
print(f"url: {xbogus[0]}, xbogus:{xbogus[1]}, ua: {xbogus[2]}")

View File

@ -0,0 +1,173 @@
import asyncio
from crawlers.douyin.web.web_crawler import DouyinWebCrawler # 导入抖音Web爬虫
from crawlers.tiktok.web.web_crawler import TikTokWebCrawler # 导入TikTok Web爬虫
from crawlers.tiktok.app.app_crawler import TikTokAPPCrawler # 导入TikTok App爬虫
class HybridCrawler:
def __init__(self):
self.DouyinWebCrawler = DouyinWebCrawler()
self.TikTokWebCrawler = TikTokWebCrawler()
self.TikTokAPPCrawler = TikTokAPPCrawler()
async def hybrid_parsing_single_video(self, url: str, minimal: bool = False):
# 解析抖音视频/Parse Douyin video
if "douyin" in url:
platform = "douyin"
aweme_id = await self.DouyinWebCrawler.get_aweme_id(url)
data = await self.DouyinWebCrawler.fetch_one_video(aweme_id)
data = data.get("aweme_detail")
# $.aweme_detail.aweme_type
aweme_type = data.get("aweme_type")
# 解析TikTok视频/Parse TikTok video
elif "tiktok" in url:
platform = "tiktok"
aweme_id = await self.TikTokWebCrawler.get_aweme_id(url)
data = await self.TikTokAPPCrawler.fetch_one_video(aweme_id)
# $.aweme_type
aweme_type = data.get("aweme_type")
else:
raise ValueError("hybrid_parsing_single_video: Cannot judge the video source from the URL.")
# 检查是否需要返回最小数据/Check if minimal data is required
if not minimal:
return data
# 如果是最小数据,处理数据/If it is minimal data, process the data
url_type_code_dict = {
# common
0: 'video',
# Douyin
2: 'image',
4: 'video',
68: 'image',
# TikTok
51: 'video',
55: 'video',
58: 'video',
61: 'video',
150: 'image'
}
# 判断链接类型/Judge link type
url_type = url_type_code_dict.get(aweme_type, 'video')
"""
以下为(视频||图片)数据处理的四个方法,如果你需要自定义数据处理请在这里修改.
The following are four methods of (video || image) data processing.
If you need to customize data processing, please modify it here.
"""
"""
创建已知数据字典(索引相同)稍后使用.update()方法更新数据
Create a known data dictionary (index the same),
and then use the .update() method to update the data
"""
result_data = {
'type': url_type,
'platform': platform,
'aweme_id': aweme_id,
'desc': data.get("desc"),
'create_time': data.get("create_time"),
'author': data.get("author"),
'music': data.get("music"),
'statistics': data.get("statistics"),
'cover_data': {
'cover': data.get("video").get("cover"),
'origin_cover': data.get("video").get("origin_cover"),
'dynamic_cover': data.get("video").get("dynamic_cover")
},
'hashtags': data.get('text_extra'),
}
# 创建一个空变量,稍后使用.update()方法更新数据/Create an empty variable and use the .update() method to update the data
api_data = None
# 判断链接类型并处理数据/Judge link type and process data
# 抖音数据处理/Douyin data processing
if platform == 'douyin':
# 抖音视频数据处理/Douyin video data processing
if url_type == 'video':
# 将信息储存在字典中/Store information in a dictionary
uri = data['video']['play_addr']['uri']
wm_video_url_HQ = data['video']['play_addr']['url_list'][0]
wm_video_url = f"https://aweme.snssdk.com/aweme/v1/playwm/?video_id={uri}&radio=1080p&line=0"
nwm_video_url_HQ = wm_video_url_HQ.replace('playwm', 'play')
nwm_video_url = f"https://aweme.snssdk.com/aweme/v1/play/?video_id={uri}&ratio=1080p&line=0"
api_data = {
'video_data':
{
'wm_video_url': wm_video_url,
'wm_video_url_HQ': wm_video_url_HQ,
'nwm_video_url': nwm_video_url,
'nwm_video_url_HQ': nwm_video_url_HQ
}
}
# 抖音图片数据处理/Douyin image data processing
elif url_type == 'image':
# 无水印图片列表/No watermark image list
no_watermark_image_list = []
# 有水印图片列表/With watermark image list
watermark_image_list = []
# 遍历图片列表/Traverse image list
for i in data['images']:
no_watermark_image_list.append(i['url_list'][0])
watermark_image_list.append(i['download_url_list'][0])
api_data = {
'image_data':
{
'no_watermark_image_list': no_watermark_image_list,
'watermark_image_list': watermark_image_list
}
}
# TikTok数据处理/TikTok data processing
elif platform == 'tiktok':
# TikTok视频数据处理/TikTok video data processing
if url_type == 'video':
# 将信息储存在字典中/Store information in a dictionary
wm_video = data['video']['download_addr']['url_list'][0]
api_data = {
'video_data':
{
'wm_video_url': wm_video,
'wm_video_url_HQ': wm_video,
'nwm_video_url': data['video']['play_addr']['url_list'][0],
'nwm_video_url_HQ': data['video']['bit_rate'][0]['play_addr']['url_list'][0]
}
}
# TikTok图片数据处理/TikTok image data processing
elif url_type == 'image':
# 无水印图片列表/No watermark image list
no_watermark_image_list = []
# 有水印图片列表/With watermark image list
watermark_image_list = []
for i in data['image_post_info']['images']:
no_watermark_image_list.append(i['display_image']['url_list'][0])
watermark_image_list.append(i['owner_watermark_image']['url_list'][0])
api_data = {
'image_data':
{
'no_watermark_image_list': no_watermark_image_list,
'watermark_image_list': watermark_image_list
}
}
# 更新数据/Update data
result_data.update(api_data)
return result_data
async def main(self):
# 测试混合解析单一视频接口/Test hybrid parsing single video endpoint
# url = "https://v.douyin.com/L4FJNR3/"
url = "https://www.tiktok.com/@evil0ctal/video/7156033831819037994"
minimal = True
result = await self.hybrid_parsing_single_video(url, minimal=minimal)
print(result)
# 占位
pass
if __name__ == '__main__':
# 实例化混合爬虫/Instantiate hybrid crawler
hybrid_crawler = HybridCrawler()
# 运行测试代码/Run test code
asyncio.run(hybrid_crawler.main())

View File

@ -0,0 +1,115 @@
# ==============================================================================
# Copyright (C) 2021 Evil0ctal
#
# This file is part of the Douyin_TikTok_Download_API project.
#
# This project is licensed under the Apache License 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
#         __
#        />  フ
#       |  _  _ l
#       ` ミ_x
#      /      | Feed me Stars ⭐
#     /  ヽ   ノ
#     │  | | |
#  / ̄|   | | |
#  | ( ̄ヽ__ヽ_)__)
#  \二つ
# ==============================================================================
#
# Contributor Link:
# - https://github.com/Evil0ctal
# - https://github.com/Johnserf-Seed
#
# ==============================================================================
import asyncio # 异步I/O
import time # 时间操作
import yaml # 配置文件
import os # 系统操作
# 基础爬虫客户端和TikTokAPI端点
from crawlers.base_crawler import BaseCrawler
from crawlers.tiktok.app.endpoints import TikTokAPIEndpoints
from crawlers.utils.utils import model_to_query_string
# TikTok接口数据请求模型
from crawlers.tiktok.app.models import (
BaseRequestModel, FeedVideoDetail
)
# 配置文件路径
path = os.path.abspath(os.path.dirname(__file__))
# 读取配置文件
with open(f"{path}/config.yaml", "r", encoding="utf-8") as f:
config = yaml.safe_load(f)
class TikTokAPPCrawler:
# 从配置文件中获取TikTok的请求头
async def get_tiktok_headers(self):
tiktok_config = config["TokenManager"]["tiktok"]
kwargs = {
"headers": {
"User-Agent": tiktok_config["headers"]["User-Agent"],
"Referer": tiktok_config["headers"]["Referer"],
"Cookie": tiktok_config["headers"]["Cookie"],
},
"proxies": {"http://": None, "https://": None},
}
return kwargs
"""-------------------------------------------------------handler接口列表-------------------------------------------------------"""
# 获取单个作品数据
async def fetch_one_video(self, aweme_id: str):
# 获取TikTok的实时Cookie
kwargs = await self.get_tiktok_headers()
params = FeedVideoDetail(aweme_id=aweme_id)
param_str = model_to_query_string(params)
url = f"{TikTokAPIEndpoints.HOME_FEED}?{param_str}"
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
response = await crawler.fetch_get_json(url)
response = response.get("aweme_list")[0]
if response.get("aweme_id") != aweme_id:
raise Exception("作品ID错误/Video ID error")
return response
"""-------------------------------------------------------main------------------------------------------------------"""
async def main(self):
# 获取单个作品数据/Fetch single post data
aweme_id = "7339393672959757570"
response = await self.fetch_one_video(aweme_id)
print(response)
# 占位
pass
if __name__ == "__main__":
# 初始化
tiktok_app_crawler = TikTokAPPCrawler()
# 开始时间
start = time.time()
asyncio.run(tiktok_app_crawler.main())
# 结束时间
end = time.time()
print(f"耗时:{end - start}")

View File

@ -0,0 +1,10 @@
TokenManager:
tiktok:
headers:
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36
Referer: https://www.tiktok.com/
Cookie: CykaBlyat=XD
proxies:
http:
https:

View File

@ -0,0 +1,10 @@
class TikTokAPIEndpoints:
"""
API Endpoints for TikTok APP
"""
# TikTok域名 (TikTok Domain)
TIKTOK_DOMAIN = "https://api22-normal-c-alisg.tiktokv.com"
# 视频主页Feed (Home Feed)
HOME_FEED = f"{TIKTOK_DOMAIN}/aweme/v1/feed/"
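# Illustrative usage (a sketch, not part of the original file): the app crawler
# appends the serialized request-model parameters to HOME_FEED as a query string;
# the aweme_id below is the one used in the crawler's own test code.
def _example_feed_url(aweme_id: str = "7339393672959757570") -> str:
    return f"{TikTokAPIEndpoints.HOME_FEED}?aweme_id={aweme_id}"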

View File

@ -0,0 +1,27 @@
import time
from typing import List
from pydantic import BaseModel
# API基础请求模型/Base Request Model
class BaseRequestModel(BaseModel):
"""
Base Request Model for TikTok API
"""
iid: int = 7318518857994389254
device_id: int = 7318517321748022790
channel: str = "googleplay"
app_name: str = "musical_ly"
version_code: str = "300904"
device_platform: str = "android"
device_type: str = "SM-ASUS_Z01QD"
os_version: str = "9"
# Feed视频详情请求模型/Feed Video Detail Request Model
class FeedVideoDetail(BaseRequestModel):
"""
Feed Video Detail Request Model
"""
aweme_id: str
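# Illustrative usage (a sketch, not part of the original module): FeedVideoDetail
# carries the Android device defaults from BaseRequestModel plus the target aweme_id;
# the app crawler serializes this dict into the HOME_FEED query string.
if __name__ == "__main__":
    print(FeedVideoDetail(aweme_id="7339393672959757570").dict())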

View File

@ -0,0 +1,26 @@
TokenManager:
tiktok:
headers:
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36
Referer: https://www.tiktok.com/
Cookie: tt_csrf_token=swzFbctS-d01xfNEDqKXaPXo2rCgRD9G0eGE; ak_bmsc=FCF8615A4E6BE10A1DD7459465B8BB16~000000000000000000000000000000~YAAQ5OIsF7olH8+OAQAAzCrl+Bd8Fp6TmQBkH2R4pvEPzVfNCvW23aFkf/Y0mVUFcZKNWkK3HNchrpk8cOzoB+oPTpg2cdLX3jXanz4KPVPEL/j/63yZ2/dMIY85Szp9osCX08pO2uUNQ2+wrynmtOO17GbOiEwz88WU+vwmbiLHh3kXYgL05mhk7QPcCbkwmrjrO8MK6eIv61PY2w9hCySeBpDjEymEZaE1//YthsBzKjiin5XEcy02P8Gyjm4JPwCT51ibYMs+MInRSeGeqDe50R14dnnMEplkXpEyvgBc56jfvIC2hLFwodY3/WST+xJBJ/wlXnPQtz4xm2qxPe6GybqqND0ebbC50QEORTU0RVTNcIZLptBmXaMCq3fTUzQakV0Enyae8w==; tt_chain_token=Hd8IafgFe6cvSObNkafaVA==; bm_sv=642D5E006DE69E254946C4A763D9A4FD~YAAQ5OIsF7slH8+OAQAA/ivl+Bc2q1mDEH+wpeYc83apdrOw3IjVn0joPM3aB0QNcF10Gz972bawwuIGWkZiGdNzpshINZ0pD2Pv3Z0dylnaTc4uVwwcjIT7JJyJIxJFs8pN7zmUM4mSj+F6QbgpX9eOO/WemqDW25TUhrS6kLm886MiRaejnoCUcAt4Gt8FGmmpdNBKTLY0EqsLeWhU5++CqzCmJ/CWT9bUrFC3hI2yf54OEru5yP28bLKYPLmj~1; tiktok_webapp_theme=light; ttwid=1%7CV7vzUo1jsQihIMKHnZpNNp9rCrGHcC4bUw4VWm5SSOQ%7C1713572754%7Ca28c6b6eb3c4b597a1111b123a5b45e305626f37f249e9bc56d02bb6be651909; passport_csrf_token=8ed81b4c85a7c20332ec9265d7057fc1; passport_csrf_token_default=8ed81b4c85a7c20332ec9265d7057fc1; perf_feed_cache={%22expireTimestamp%22:1713744000000%2C%22itemIds%22:[%227350454864071150890%22%2C%227342421363656789291%22%2C%227341080338803821866%22]}; msToken=kd_x8QL-A4Z1FfvqdPF--bXF06lUvRkB5Q41QEr-kXiaV30qUXhT9iy5cIiRggNDA1B5vyaPkwPKQwQX4oZHAW2A0C_u23wyNLjdW3p8WfKTfOSCoNLPP5NkM9LCBUXKXvoMIEVU_EvSN-A-bQ==; msToken=pJU9vxWWaHLsaRwF5O3JdwuldPan_W6Zx0TcL3wMMQE1Ni2B76ukBfVfHb_TKnwgAoKAYT7uXB5JiY1TG8bxEAPYJBrIs78KFI6ZU5l79GWsCZHXkDlB7tpMf_jfSAFOYjxBAFy_4XWaGVzBRg==
proxies:
http:
https:
msToken:
url: https://mssdk-sg.tiktok.com/web/common?msToken=1Ab-7YxR9lUHSem0PraI_XzdKmpHb6j50L8AaXLAd2aWTdoJCYLfX_67rVQFE4UwwHVHmyG_NfIipqrlLT3kCXps-5PYlNAqtdwEg7TrDyTAfCKyBrOLmhMUjB55oW8SPZ4_EkNxNFUdV7MquA==
magic: 538969122
version: 1
dataType: 8
strData: 3BvqYbNXLLOcZehvxZVbjpAu7vq82RoWmFSJHLFwzDwJIZevE0AeilQfP55LridxmdGGjknoksqIsLqlMHMif0IFK/Br7JWqxOHnYuMwVCnttFc0Y4MFvdVWM5FECiEulJC0Dc+eeVsNSrFnAc9K7fazqdglyJgGLSfXIJmgyCvvQ4pg0u5HBVVugLSWs242X42fjoWymaUCLZJQo6vi6WLyuV7l5IC3Mg+lelr5xBQD6Q7hBIFEw8zzxJ1n2DyA4xLbOHTQdKvEtsK7XzyWwjpRnojPTbBl69Zosnuru+lOBIl+tFu/+hCQ1m0jYZwTP4rVE75L3Du6+KZ5v/9TyFYjq7y3y9bGLP4d7yQueJbF90G1yrZ6htElrZ2vqZKDrIqBVbmOZr/nph12k2JKrITtN0R/pMsp0sJ4gesQnXxcD/pLOFAINHk7umgbe6LzJ7+TLUdGuO4M7xiEg/jCqhjgJX1izZ4NPoBDp35zRxj6Y6OrcstlTN/cv5sz663+Nco/mEwhGq2VwrL4gAIAPycndIsb48dPdtngmLqNDNN0ZyVRjgqVIDXXrxigXCkR9CH89Dlrrb7QQqWVgRXz9/k5ihEM43BR3sd3mMU/XgFLN1Aoxf6GzzdxP2QPBI75/ZoHoAmu54v8gTmA3ntCGlEF0zgaFGTdpkGdb+oZgyQM4pw1aAyxmFINXkpD3IKKoGev9kD9gTFnhiQMGCMemhZS7ZYdbuGu0Cb+lQKaL/QTt80FMyGmW8kzVy9xW/ja9BcdEJYRoaufuFRkBFG5ay8x4WHLR6hEapXqQial/cREbLL4sQytpjtmnndFqvT7xN5DhgsLY2Z7451MJhD6NJXKNrMafGZSbItzQWY=
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
ttwid:
url: https://www.tiktok.com/ttwid/check/
data: '{"aid":1988,"service":"www.tiktok.com","union":false,"unionHost":"","needFid":false,"fid":"","migrate_priority":0}'
cookie: ttwid=1%7CovVQu2St-HXSHAdEfZ7tljPe151SZ88AbrlTirlaC6w%7C1701072604%7C49b17849da69bafc3638e794f3f26b30fe9677c5253e65a2a5f615489846ce02
odin_tt:
url: https://www.tiktok.com/passport/web/account/info/?aid=1459&app_language=zh-Hans&app_name=tiktok_web&browser_language=zh-CN&browser_name=Mozilla&browser_online=true&browser_platform=Win32&browser_version=5.0%20%28Windows%20NT%2010.0%3B%20Win64%3B%20x64%29%20AppleWebKit%2F537.36%20%28KHTML%2C%20like%20Gecko%29%20Chrome%2F119.0.0.0%20Safari%2F537.36&channel=tiktok_web&cookie_enabled=true&device_id=7306060721837852167&root_referer=https%3A%2F%2Fwww.tiktok.com%2Flogin%2F

View File

@ -0,0 +1,52 @@
class TikTokAPIEndpoints:
"""
API Endpoints for TikTok
"""
# TikTok域名 (TikTok Domain)
TIKTOK_DOMAIN = "https://www.tiktok.com"
# 直播域名 (Webcast Domain)
WEBCAST_DOMAIN = "https://webcast.tiktok.com"
# 登录 (Login)
LOGIN_ENDPOINT = f"{TIKTOK_DOMAIN}/login/"
# 首页推荐 (Home Recommend)
HOME_RECOMMEND = f"{TIKTOK_DOMAIN}/api/recommend/item_list/"
# 用户详细信息 (User Detail Info)
USER_DETAIL = f"{TIKTOK_DOMAIN}/api/user/detail/"
# 用户作品 (User Post)
USER_POST = f"{TIKTOK_DOMAIN}/api/post/item_list/"
# 用户点赞 (User Like)
USER_LIKE = f"{TIKTOK_DOMAIN}/api/favorite/item_list/"
# 用户收藏 (User Collect)
USER_COLLECT = f"{TIKTOK_DOMAIN}/api/user/collect/item_list/"
# 用户播放列表 (User Play List)
USER_PLAY_LIST = f"{TIKTOK_DOMAIN}/api/user/playlist/"
# 用户合辑 (User Mix)
USER_MIX = f"{TIKTOK_DOMAIN}/api/mix/item_list/"
# 猜你喜欢 (Guess You Like)
GUESS_YOU_LIKE = f"{TIKTOK_DOMAIN}/api/related/item_list/"
# 用户关注 (User Follow)
USER_FOLLOW = f"{TIKTOK_DOMAIN}/api/user/list/"
# 用户粉丝 (User Fans)
USER_FANS = f"{TIKTOK_DOMAIN}/api/user/list/"
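# 注意: USER_FOLLOW 与 USER_FANS 共用同一路径 /api/user/list/ (Note: follow and fans share the
# same path; the `scene` field in the request models tells them apart — scene=21 for follow, scene=67 for fans)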
# 作品信息 (Post Detail)
POST_DETAIL = f"{TIKTOK_DOMAIN}/api/item/detail/"
# 作品评论 (Post Comment)
POST_COMMENT = f"{TIKTOK_DOMAIN}/api/comment/list/"
# 作品评论回复 (Post Comment Reply)
POST_COMMENT_REPLY = f"{TIKTOK_DOMAIN}/api/comment/list/reply/"

View File

@ -0,0 +1,121 @@
from typing import Any
from pydantic import BaseModel
from urllib.parse import quote, unquote
from crawlers.tiktok.web.utils import TokenManager
from crawlers.utils.utils import get_timestamp
# Model
class BaseRequestModel(BaseModel):
WebIdLastTime: str = str(get_timestamp("sec"))
aid: str = "1988"
app_language: str = "en"
app_name: str = "tiktok_web"
browser_language: str = "en-US"
browser_name: str = "Mozilla"
browser_online: str = "true"
browser_platform: str = "Win32"
browser_version: str = quote(
"5.0 (Windows)",
safe="",
)
channel: str = "tiktok_web"
cookie_enabled: str = "true"
device_id: int = 7349090360347690538
device_platform: str = "web_pc"
focus_state: str = "true"
from_page: str = "user"
history_len: int = 4
is_fullscreen: str = "false"
is_page_visible: str = "true"
language: str = "en"
os: str = "windows"
priority_region: str = "US"
referer: str = ""
region: str = "US" # SG JP KR...
root_referer: str = quote("https://www.tiktok.com/", safe="")
screen_height: int = 1080
screen_width: int = 1920
webcast_language: str = "en"
tz_name: str = quote("America/Tijuana", safe="")
# verifyFp: str = VerifyFpManager.gen_verify_fp()
msToken: str = TokenManager.gen_real_msToken()
# router model
class UserProfile(BaseRequestModel):
secUid: str = ""
uniqueId: str
class UserPost(BaseRequestModel):
coverFormat: int = 2
count: int = 35
cursor: int = 0
secUid: str
class UserLike(BaseRequestModel):
coverFormat: int = 2
count: int = 30
cursor: int = 0
secUid: str
class UserCollect(BaseRequestModel):
cookie: str = ""
coverFormat: int = 2
count: int = 30
cursor: int = 0
secUid: str
class UserPlayList(BaseRequestModel):
count: int = 30
cursor: int = 0
secUid: str
class UserMix(BaseRequestModel):
count: int = 30
cursor: int = 0
mixId: str
class PostDetail(BaseRequestModel):
itemId: str
class PostComment(BaseRequestModel):
aweme_id: str
count: int = 20
cursor: int = 0
current_region: str = "US"
# 作品评论回复 (Post Comment Reply)
class PostCommentReply(BaseRequestModel):
item_id: str
comment_id: str
count: int = 20
cursor: int = 0
current_region: str = "US"
# 用户粉丝 (User Fans)
class UserFans(BaseRequestModel):
secUid: str
count: int = 30
maxCursor: int = 0
minCursor: int = 0
scene: int = 67
# 用户关注 (User Follow)
class UserFollow(BaseRequestModel):
secUid: str
count: int = 30
maxCursor: int = 0
minCursor: int = 0
scene: int = 21
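# --- 使用示例 (Usage sketch) ---
# Illustrative only: shows how one of the request models above becomes the "k=v&k=v"
# parameter string that BogusManager signs with X-Bogus elsewhere in this commit.
# Note that importing this module already performs one live msToken request via
# TokenManager.gen_real_msToken() in BaseRequestModel; the secUid below is a placeholder.
if __name__ == "__main__":
    params = UserPost(secUid="PLACEHOLDER_SEC_UID")
    param_str = "&".join(f"{k}={v}" for k, v in params.dict().items())
    print(param_str)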

View File

@ -0,0 +1,666 @@
import os
import re
import json
import yaml
import httpx
import asyncio
from typing import Union
from pathlib import Path
from crawlers.utils.logger import logger
from crawlers.douyin.web.xbogus import XBogus as XB
from crawlers.utils.utils import (
gen_random_str,
get_timestamp,
extract_valid_urls,
split_filename,
)
from crawlers.utils.api_exceptions import (
APIError,
APIConnectionError,
APIResponseError,
APIUnauthorizedError,
APINotFoundError,
)
# 配置文件路径
# Read the configuration file
path = os.path.abspath(os.path.dirname(__file__))
# 读取配置文件
with open(f"{path}/config.yaml", "r", encoding="utf-8") as f:
config = yaml.safe_load(f)
class TokenManager:
tiktok_manager = config.get("TokenManager").get("tiktok")
token_conf = tiktok_manager.get("msToken", None)
ttwid_conf = tiktok_manager.get("ttwid", None)
odin_tt_conf = tiktok_manager.get("odin_tt", None)
proxies_conf = tiktok_manager.get("proxies", None)
proxies = {
"http://": proxies_conf.get("http", None),
"https://": proxies_conf.get("https", None),
}
@classmethod
def gen_real_msToken(cls) -> str:
"""
生成真实的msToken,当出现错误时返回虚假的值
(Generate a real msToken and return a false value when an error occurs)
"""
payload = json.dumps(
{
"magic": cls.token_conf["magic"],
"version": cls.token_conf["version"],
"dataType": cls.token_conf["dataType"],
"strData": cls.token_conf["strData"],
"tspFromClient": get_timestamp(),
}
)
headers = {
"User-Agent": cls.token_conf["User-Agent"],
"Content-Type": "application/json",
}
transport = httpx.HTTPTransport(retries=5)
with httpx.Client(transport=transport, proxies=cls.proxies) as client:
try:
response = client.post(
cls.token_conf["url"], headers=headers, content=payload
)
response.raise_for_status()
msToken = str(httpx.Cookies(response.cookies).get("msToken"))
if len(msToken) not in [148]:
raise APIResponseError("{0} 内容不符合要求".format("msToken"))
return msToken
except httpx.RequestError as exc:
# 捕获所有与 httpx 请求相关的异常情况 (Captures all httpx request-related exceptions)
raise APIConnectionError("请求端点失败,请检查当前网络环境。 链接:{0},代理:{1},异常类名:{2},异常详细信息:{3}"
.format(cls.token_conf["url"], cls.proxies, cls.__name__, exc)
)
except httpx.HTTPStatusError as e:
# 捕获 httpx 的状态代码错误 (captures specific status code errors from httpx)
if response.status_code == 401:
raise APIUnauthorizedError("参数验证失败,请更新配置文件中的 {0},以匹配 {1} 新规则"
.format("msToken", "tiktok")
)
elif response.status_code == 404:
raise APINotFoundError("{0} 无法找到API端点".format("msToken"))
else:
raise APIResponseError("链接:{0},状态码 {1}{2} ".format(
e.response.url, e.response.status_code, e.response.text
)
)
except APIError as e:
# 返回虚假的msToken (Return a fake msToken)
logger.error("msToken API错误{0}".format(e))
logger.info("生成虚假的msToken")
return cls.gen_false_msToken()
@classmethod
def gen_false_msToken(cls) -> str:
"""生成随机msToken (Generate random msToken)"""
return gen_random_str(146) + "=="
@classmethod
def gen_ttwid(cls, cookie: str) -> str:
"""
生成请求必带的ttwid (Generate the essential ttwid for requests)
"""
transport = httpx.HTTPTransport(retries=5)
with httpx.Client(transport=transport, proxies=cls.proxies) as client:
try:
response = client.post(
cls.ttwid_conf["url"],
content=cls.ttwid_conf["data"],
headers={
"Cookie": cookie,
"Content-Type": "text/plain",
},
)
response.raise_for_status()
ttwid = httpx.Cookies(response.cookies).get("ttwid")
if ttwid is None:
raise APIResponseError(
"ttwid: 检查没有通过, 请更新配置文件中的ttwid"
)
return ttwid
except httpx.RequestError as exc:
# 捕获所有与 httpx 请求相关的异常情况 (Captures all httpx request-related exceptions)
raise APIConnectionError("请求端点失败,请检查当前网络环境。 链接:{0},代理:{1},异常类名:{2},异常详细信息:{3}"
.format(cls.ttwid_conf["url"], cls.proxies, cls.__name__, exc)
)
except httpx.HTTPStatusError as e:
# 捕获 httpx 的状态代码错误 (captures specific status code errors from httpx)
if response.status_code == 401:
raise APIUnauthorizedError("参数验证失败,请更新配置文件中的 {0},以匹配 {1} 新规则"
.format("ttwid", "tiktok")
)
elif response.status_code == 404:
raise APINotFoundError("{0} 无法找到API端点".format("ttwid"))
else:
raise APIResponseError("链接:{0},状态码 {1}{2} ".format(
e.response.url, e.response.status_code, e.response.text
)
)
@classmethod
def gen_odin_tt(cls):
"""
生成请求必带的odin_tt (Generate the essential odin_tt for requests)
"""
transport = httpx.HTTPTransport(retries=5)
with httpx.Client(transport=transport, proxies=cls.proxies) as client:
try:
response = client.get(cls.odin_tt_conf["url"])
response.raise_for_status()
odin_tt = httpx.Cookies(response.cookies).get("odin_tt")
if odin_tt is None:
raise APIResponseError("{0} 内容不符合要求".format("odin_tt"))
return odin_tt
except httpx.RequestError as exc:
# 捕获所有与 httpx 请求相关的异常情况 (Captures all httpx request-related exceptions)
raise APIConnectionError("请求端点失败,请检查当前网络环境。 链接:{0},代理:{1},异常类名:{2},异常详细信息:{3}"
.format(cls.odin_tt_conf["url"], cls.proxies, cls.__name__, exc)
)
except httpx.HTTPStatusError as e:
# 捕获 httpx 的状态代码错误 (captures specific status code errors from httpx)
if response.status_code == 401:
raise APIUnauthorizedError("参数验证失败,请更新配置文件中的 {0},以匹配 {1} 新规则"
.format("odin_tt", "tiktok")
)
elif response.status_code == 404:
raise APINotFoundError("{0} 无法找到API端点".format("odin_tt"))
else:
raise APIResponseError("链接:{0},状态码 {1}{2} ".format(
e.response.url, e.response.status_code, e.response.text
)
)
class BogusManager:
@classmethod
def xb_str_2_endpoint(
cls,
user_agent: str,
endpoint: str,
) -> str:
try:
final_endpoint = XB(user_agent).getXBogus(endpoint)
except Exception as e:
raise RuntimeError("生成X-Bogus失败: {0}".format(e))
return final_endpoint[0]
@classmethod
def model_2_endpoint(
cls,
base_endpoint: str,
params: dict,
user_agent: str,
) -> str:
# 检查params是否是一个字典 (Check if params is a dict)
if not isinstance(params, dict):
raise TypeError("参数必须是字典类型")
param_str = "&".join([f"{k}={v}" for k, v in params.items()])
try:
xb_value = XB(user_agent).getXBogus(param_str)
except Exception as e:
raise RuntimeError("生成X-Bogus失败: {0}".format(e))
# 检查base_endpoint是否已有查询参数 (Check if base_endpoint already has query parameters)
separator = "&" if "?" in base_endpoint else "?"
final_endpoint = f"{base_endpoint}{separator}{param_str}&X-Bogus={xb_value[1]}"
return final_endpoint
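# 使用示例 (Usage sketch, illustrative only; values are placeholders and mirror how the web
# crawler in this commit calls model_2_endpoint):
#   ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ..."
#   BogusManager.model_2_endpoint(TikTokAPIEndpoints.POST_DETAIL, PostDetail(itemId="123").dict(), ua)
#   -> "https://www.tiktok.com/api/item/detail/?...&itemId=123&X-Bogus=..."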
class SecUserIdFetcher:
# 预编译正则表达式
_TIKTOK_SECUID_PARREN = re.compile(
r"<script id=\"__UNIVERSAL_DATA_FOR_REHYDRATION__\" type=\"application/json\">(.*?)</script>"
)
_TIKTOK_UNIQUEID_PARREN = re.compile(r"/@([^/?]*)")
_TIKTOK_NOTFOUND_PARREN = re.compile(r"notfound")
@classmethod
async def get_secuid(cls, url: str) -> str:
"""
获取TikTok用户sec_uid
Args:
url: 用户主页链接
Return:
sec_uid: 用户唯一标识
"""
# 进行参数检查
if not isinstance(url, str):
raise TypeError("输入参数必须是字符串")
# 提取有效URL
url = extract_valid_urls(url)
if url is None:
raise (
APINotFoundError("输入的URL不合法。类名{0}".format(cls.__name__))
)
transport = httpx.AsyncHTTPTransport(retries=5)
async with httpx.AsyncClient(
transport=transport, proxies=TokenManager.proxies, timeout=10
) as client:
try:
response = await client.get(url, follow_redirects=True)
# 444一般为Nginx拦截不返回状态 (444 is generally intercepted by Nginx and does not return status)
if response.status_code in {200, 444}:
if cls._TIKTOK_NOTFOUND_PARREN.search(str(response.url)):
raise APINotFoundError("页面不可用,可能是由于区域限制(代理)造成的。类名: {0}"
.format(cls.__name__)
)
match = cls._TIKTOK_SECUID_PARREN.search(str(response.text))
if not match:
raise APIResponseError("未在响应中找到 {0},检查链接是否为用户主页。类名: {1}"
.format("sec_uid", cls.__name__)
)
# 提取 __UNIVERSAL_DATA_FOR_REHYDRATION__ 中的sec_uid (Extract sec_uid from the __UNIVERSAL_DATA_FOR_REHYDRATION__ JSON)
data = json.loads(match.group(1))
default_scope = data.get("__DEFAULT_SCOPE__", {})
user_detail = default_scope.get("webapp.user-detail", {})
user_info = user_detail.get("userInfo", {}).get("user", {})
sec_uid = user_info.get("secUid")
if sec_uid is None:
raise RuntimeError(
"获取 {0} 失败,{1}".format(sec_uid, user_info)
)
return sec_uid
else:
raise ConnectionError("接口状态码异常, 请检查重试")
except httpx.RequestError as exc:
# 捕获所有与 httpx 请求相关的异常情况 (Captures all httpx request-related exceptions)
raise APIConnectionError("请求端点失败,请检查当前网络环境。 链接:{0},代理:{1},异常类名:{2},异常详细信息:{3}"
.format(url, TokenManager.proxies, cls.__name__, exc)
)
@classmethod
async def get_all_secuid(cls, urls: list) -> list:
"""
获取列表secuid列表 (Get list sec_user_id list)
Args:
urls: list: 用户url列表 (User url list)
Return:
secuids: list: 用户secuid列表 (User secuid list)
"""
if not isinstance(urls, list):
raise TypeError("参数必须是列表类型")
# 提取有效URL
urls = extract_valid_urls(urls)
if urls == []:
raise (
APINotFoundError(
"输入的URL List不合法。类名{0}".format(cls.__name__)
)
)
secuids = [cls.get_secuid(url) for url in urls]
return await asyncio.gather(*secuids)
@classmethod
async def get_uniqueid(cls, url: str) -> str:
"""
获取TikTok用户unique_id
Args:
url: 用户主页链接
Return:
unique_id: 用户唯一id
"""
# 进行参数检查
if not isinstance(url, str):
raise TypeError("输入参数必须是字符串")
# 提取有效URL
url = extract_valid_urls(url)
if url is None:
raise (
APINotFoundError("输入的URL不合法。类名{0}".format(cls.__name__))
)
transport = httpx.AsyncHTTPTransport(retries=5)
async with httpx.AsyncClient(
transport=transport, proxies=TokenManager.proxies, timeout=10
) as client:
try:
response = await client.get(url, follow_redirects=True)
if response.status_code in {200, 444}:
if cls._TIKTOK_NOTFOUND_PARREN.search(str(response.url)):
raise APINotFoundError("页面不可用,可能是由于区域限制(代理)造成的。类名: {0}"
.format(cls.__name__)
)
match = cls._TIKTOK_UNIQUEID_PARREN.search(str(response.url))
if not match:
raise APIResponseError(
"未在响应中找到 {0}".format("unique_id")
)
unique_id = match.group(1)
if unique_id is None:
raise RuntimeError(
"获取 {0} 失败,{1}".format("unique_id", response.url)
)
return unique_id
else:
raise ConnectionError(
"接口状态码异常 {0}, 请检查重试".format(response.status_code)
)
except httpx.RequestError:
raise APIConnectionError("连接端点失败,检查网络环境或代理:{0} 代理:{1} 类名:{2}"
.format(url, TokenManager.proxies, cls.__name__),
)
@classmethod
async def get_all_uniqueid(cls, urls: list) -> list:
"""
获取列表unique_id列表 (Get a list of unique_id from a list of URLs)
Args:
urls: list: 用户url列表 (User url list)
Return:
unique_ids: list: 用户unique_id列表 (User unique_id list)
"""
if not isinstance(urls, list):
raise TypeError("参数必须是列表类型")
# 提取有效URL
urls = extract_valid_urls(urls)
if urls == []:
raise (
APINotFoundError(
"输入的URL List不合法。类名{0}".format(cls.__name__)
)
)
unique_ids = [cls.get_uniqueid(url) for url in urls]
return await asyncio.gather(*unique_ids)
class AwemeIdFetcher:
# https://www.tiktok.com/@scarlettjonesuk/video/7255716763118226715
# https://www.tiktok.com/@scarlettjonesuk/video/7255716763118226715?is_from_webapp=1&sender_device=pc&web_id=7306060721837852167
# 预编译正则表达式
_TIKTOK_AWEMEID_PARREN = re.compile(r"video/(\d*)")
_TIKTOK_NOTFOUND_PARREN = re.compile(r"notfound")
@classmethod
async def get_aweme_id(cls, url: str) -> str:
"""
获取TikTok作品aweme_id
Args:
url: 作品链接
Return:
aweme_id: 作品唯一标识
"""
# 进行参数检查
if not isinstance(url, str):
raise TypeError("输入参数必须是字符串")
# 提取有效URL
url = extract_valid_urls(url)
if url is None:
raise (
APINotFoundError("输入的URL不合法。类名{0}".format(cls.__name__))
)
transport = httpx.AsyncHTTPTransport(retries=5)
async with httpx.AsyncClient(
transport=transport, proxies=TokenManager.proxies, timeout=10
) as client:
try:
response = await client.get(url, follow_redirects=True)
if response.status_code in {200, 444}:
if cls._TIKTOK_NOTFOUND_PARREN.search(str(response.url)):
raise APINotFoundError("页面不可用,可能是由于区域限制(代理)造成的。类名: {0}"
.format(cls.__name__)
)
match = cls._TIKTOK_AWEMEID_PARREN.search(str(response.url))
if not match:
raise APIResponseError(
"未在响应中找到 {0}".format("aweme_id")
)
aweme_id = match.group(1)
if aweme_id is None:
raise RuntimeError(
"获取 {0} 失败,{1}".format("aweme_id", response.url)
)
return aweme_id
else:
raise ConnectionError(
"接口状态码异常 {0},请检查重试".format(response.status_code)
)
except httpx.RequestError as exc:
# 捕获所有与 httpx 请求相关的异常情况 (Captures all httpx request-related exceptions)
raise APIConnectionError("请求端点失败,请检查当前网络环境。 链接:{0},代理:{1},异常类名:{2},异常详细信息:{3}"
.format(url, TokenManager.proxies, cls.__name__, exc)
)
@classmethod
async def get_all_aweme_id(cls, urls: list) -> list:
"""
获取视频aweme_id,传入列表url都可以解析出aweme_id (Get video aweme_id, pass in the list url can parse out aweme_id)
Args:
urls: list: 列表url (list url)
Return:
aweme_ids: list: 视频的唯一标识返回列表 (The unique identifier of the video, return list)
"""
if not isinstance(urls, list):
raise TypeError("参数必须是列表类型")
# 提取有效URL
urls = extract_valid_urls(urls)
if urls == []:
raise (
APINotFoundError(
"输入的URL List不合法。类名{0}".format(cls.__name__)
)
)
aweme_ids = [cls.get_aweme_id(url) for url in urls]
return await asyncio.gather(*aweme_ids)
def format_file_name(
naming_template: str,
aweme_data: dict = {},
custom_fields: dict = {},
) -> str:
"""
根据配置文件的全局格式化文件名
(Format file name according to the global conf file)
Args:
aweme_data (dict): 抖音数据的字典 (dict of douyin data)
naming_template (str): 文件的命名模板, "{create}_{desc}" (Naming template for files, such as "{create}_{desc}")
custom_fields (dict): 用户自定义字段, 用于替代默认的字段值 (Custom fields for replacing default field values)
Note:
windows 文件名长度限制为 255 个字符, 开启了长文件名支持后为 32,767 个字符
(Windows file name length limit is 255 characters, 32,767 characters after long file name support is enabled)
Unix 文件名长度限制为 255 个字符
(Unix file name length limit is 255 characters)
取去除后的50个字符, 加上后缀, 一般不会超过255个字符
(Take the removed 50 characters, add the suffix, and generally not exceed 255 characters)
详细信息请参考: https://en.wikipedia.org/wiki/Filename#Length
(For more information, please refer to: https://en.wikipedia.org/wiki/Filename#Length)
Returns:
str: 格式化的文件名 (Formatted file name)
"""
# 为不同系统设置不同的文件名长度限制
os_limit = {
"win32": 200,
"cygwin": 60,
"darwin": 60,
"linux": 60,
}
fields = {
"create": aweme_data.get("createTime", ""), # 长度固定19
"nickname": aweme_data.get("nickname", ""), # 最长30
"aweme_id": aweme_data.get("aweme_id", ""), # 长度固定19
"desc": split_filename(aweme_data.get("desc", ""), os_limit),
"uid": aweme_data.get("uid", ""), # 固定11
}
if custom_fields:
# 更新自定义字段
fields.update(custom_fields)
try:
return naming_template.format(**fields)
except KeyError as e:
raise KeyError("文件名模板字段 {0} 不存在,请检查".format(e))
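# 使用示例 (Usage sketch, illustrative only):
#   format_file_name("{create}_{nickname}_{desc}",
#                    {"createTime": "2024-04-22 21-02-42", "nickname": "tiktok", "desc": "hello"})
#   -> "2024-04-22 21-02-42_tiktok_hello"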
def create_user_folder(kwargs: dict, nickname: Union[str, int]) -> Path:
"""
根据提供的配置文件和昵称创建对应的保存目录
(Create the corresponding save directory according to the provided conf file and nickname.)
Args:
kwargs (dict): 配置文件字典格式(Conf file, dict format)
nickname (Union[str, int]): 用户的昵称允许字符串或整数 (User nickname, allow strings or integers)
Note:
如果未在配置文件中指定路径则默认为 "Download"
(If the path is not specified in the conf file, it defaults to "Download".)
仅支持相对路径
(Only relative paths are supported.)
Raises:
TypeError: 如果 kwargs 不是字典格式将引发 TypeError
(If kwargs is not in dict format, TypeError will be raised.)
"""
# 确定函数参数是否正确
if not isinstance(kwargs, dict):
raise TypeError("kwargs 参数必须是字典")
# 创建基础路径
base_path = Path(kwargs.get("path", "Download"))
# 添加下载模式和用户名
user_path = (
base_path / "tiktok" / kwargs.get("mode", "PLEASE_SETUP_MODE") / str(nickname)
)
# 获取绝对路径并确保它存在
resolve_user_path = user_path.resolve()
# 创建目录
resolve_user_path.mkdir(parents=True, exist_ok=True)
return resolve_user_path
def rename_user_folder(old_path: Path, new_nickname: str) -> Path:
"""
重命名用户目录 (Rename User Folder).
Args:
old_path (Path): 旧的用户目录路径 (Path of the old user folder)
new_nickname (str): 新的用户昵称 (New user nickname)
Returns:
Path: 重命名后的用户目录路径 (Path of the renamed user folder)
"""
# 获取目标目录的父目录 (Get the parent directory of the target folder)
parent_directory = old_path.parent
# 构建新目录路径 (Construct the new directory path)
new_path = old_path.rename(parent_directory / new_nickname).resolve()
return new_path
def create_or_rename_user_folder(
kwargs: dict, local_user_data: dict, current_nickname: str
) -> Path:
"""
创建或重命名用户目录 (Create or rename user directory)
Args:
kwargs (dict): 配置参数 (Conf parameters)
local_user_data (dict): 本地用户数据 (Local user data)
current_nickname (str): 当前用户昵称 (Current user nickname)
Returns:
user_path (Path): 用户目录路径 (User directory path)
"""
user_path = create_user_folder(kwargs, current_nickname)
if not local_user_data:
return user_path
if local_user_data.get("nickname") != current_nickname:
# 昵称不一致,触发目录更新操作
user_path = rename_user_folder(user_path, current_nickname)
return user_path

View File

@ -0,0 +1,490 @@
# ==============================================================================
# Copyright (C) 2021 Evil0ctal
#
# This file is part of the Douyin_TikTok_Download_API project.
#
# This project is licensed under the Apache License 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
#         __
#        />  フ
#       |  _  _ l
#       ` ミ_x
#      /      | Feed me Stars ⭐
#     /  ヽ   ノ
#     │  | | |
#  / ̄|   | | |
#  | ( ̄ヽ__ヽ_)__)
#  \二つ
# ==============================================================================
#
# Contributor Link:
# - https://github.com/Evil0ctal
# - https://github.com/Johnserf-Seed
#
# ==============================================================================
import asyncio # 异步I/O
import time # 时间操作
import yaml # 配置文件
import os # 系统操作
# 基础爬虫客户端和TikTokAPI端点
from crawlers.base_crawler import BaseCrawler
from crawlers.tiktok.web.endpoints import TikTokAPIEndpoints
from crawlers.utils.utils import extract_valid_urls
# TikTok加密参数生成器
from crawlers.tiktok.web.utils import (
AwemeIdFetcher,
BogusManager,
SecUserIdFetcher,
TokenManager
)
# TikTok接口数据请求模型
from crawlers.tiktok.web.models import (
UserProfile,
UserPost,
UserLike,
UserMix,
UserCollect,
PostDetail,
UserPlayList,
PostComment,
PostCommentReply,
UserFans,
UserFollow
)
# 配置文件路径
path = os.path.abspath(os.path.dirname(__file__))
# 读取配置文件
with open(f"{path}/config.yaml", "r", encoding="utf-8") as f:
config = yaml.safe_load(f)
class TikTokWebCrawler:
def __init__(self):
self.proxy_pool = None
# 从配置文件中获取TikTok的请求头
async def get_tiktok_headers(self):
tiktok_config = config["TokenManager"]["tiktok"]
kwargs = {
"headers": {
"User-Agent": tiktok_config["headers"]["User-Agent"],
"Referer": tiktok_config["headers"]["Referer"],
"Cookie": tiktok_config["headers"]["Cookie"],
},
"proxies": {"http://": None, "https://": None},
}
return kwargs
"""-------------------------------------------------------handler接口列表-------------------------------------------------------"""
# 获取单个作品数据
async def fetch_one_video(self, itemId: str):
# 获取TikTok的实时Cookie
kwargs = await self.get_tiktok_headers()
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
# 创建一个作品详情的BaseModel参数
params = PostDetail(itemId=itemId)
# 生成一个作品详情的带有加密参数的Endpoint
endpoint = BogusManager.model_2_endpoint(
TikTokAPIEndpoints.POST_DETAIL, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取用户的个人信息
async def fetch_user_profile(self, secUid: str, uniqueId: str):
# 获取TikTok的实时Cookie
kwargs = await self.get_tiktok_headers()
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
# 创建一个用户详情的BaseModel参数
params = UserProfile(secUid=secUid, uniqueId=uniqueId)
# 生成一个用户详情的带有加密参数的Endpoint
endpoint = BogusManager.model_2_endpoint(
TikTokAPIEndpoints.USER_DETAIL, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取用户的作品列表
async def fetch_user_post(self, secUid: str, cursor: int = 0, count: int = 35, coverFormat: int = 2):
# 获取TikTok的实时Cookie
kwargs = await self.get_tiktok_headers()
# proxies = {"http://": 'http://43.159.29.191:24144', "https://": 'http://43.159.29.191:24144'}
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=None, crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
# 创建一个用户作品的BaseModel参数
params = UserPost(secUid=secUid, cursor=cursor, count=count, coverFormat=coverFormat)
# 生成一个用户作品的带有加密参数的Endpoint
endpoint = BogusManager.model_2_endpoint(
TikTokAPIEndpoints.USER_POST, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取用户的点赞列表
async def fetch_user_like(self, secUid: str, cursor: int = 0, count: int = 30, coverFormat: int = 2):
# 获取TikTok的实时Cookie
kwargs = await self.get_tiktok_headers()
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
# 创建一个用户点赞的BaseModel参数
params = UserLike(secUid=secUid, cursor=cursor, count=count, coverFormat=coverFormat)
# 生成一个用户点赞的带有加密参数的Endpoint
endpoint = BogusManager.model_2_endpoint(
TikTokAPIEndpoints.USER_LIKE, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取用户的收藏列表
async def fetch_user_collect(self, cookie: str, secUid: str, cursor: int = 0, count: int = 30,
coverFormat: int = 2):
# 获取TikTok的实时Cookie
kwargs = await self.get_tiktok_headers()
kwargs["headers"]["Cookie"] = cookie
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
# 创建一个用户收藏的BaseModel参数
params = UserCollect(cookie=cookie, secUid=secUid, cursor=cursor, count=count, coverFormat=coverFormat)
# 生成一个用户收藏的带有加密参数的Endpoint
endpoint = BogusManager.model_2_endpoint(
TikTokAPIEndpoints.USER_COLLECT, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取用户的播放列表
async def fetch_user_play_list(self, secUid: str, cursor: int = 0, count: int = 30):
# 获取TikTok的实时Cookie
kwargs = await self.get_tiktok_headers()
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
# 创建一个用户播放列表的BaseModel参数
params = UserPlayList(secUid=secUid, cursor=cursor, count=count)
# 生成一个用户播放列表的带有加密参数的Endpoint
endpoint = BogusManager.model_2_endpoint(
TikTokAPIEndpoints.USER_PLAY_LIST, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取用户的合辑列表
async def fetch_user_mix(self, mixId: str, cursor: int = 0, count: int = 30):
# 获取TikTok的实时Cookie
kwargs = await self.get_tiktok_headers()
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
# 创建一个用户合辑的BaseModel参数
params = UserMix(mixId=mixId, cursor=cursor, count=count)
# 生成一个用户合辑的带有加密参数的Endpoint
endpoint = BogusManager.model_2_endpoint(
TikTokAPIEndpoints.USER_MIX, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取作品的评论列表
async def fetch_post_comment(self, aweme_id: str, cursor: int = 0, count: int = 20, current_region: str = ""):
# 获取TikTok的实时Cookie
kwargs = await self.get_tiktok_headers()
# proxies = {"http://": 'http://43.159.18.174:25263', "https://": 'http://43.159.18.174:25263'}
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=None, crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
# 创建一个作品评论的BaseModel参数
params = PostComment(aweme_id=aweme_id, cursor=cursor, count=count, current_region=current_region)
# 生成一个作品评论的带有加密参数的Endpoint
endpoint = BogusManager.model_2_endpoint(
TikTokAPIEndpoints.POST_COMMENT, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取作品的评论回复列表
async def fetch_post_comment_reply(self, item_id: str, comment_id: str, cursor: int = 0, count: int = 20,
current_region: str = ""):
# 获取TikTok的实时Cookie
kwargs = await self.get_tiktok_headers()
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
# 创建一个作品评论的BaseModel参数
params = PostCommentReply(item_id=item_id, comment_id=comment_id, cursor=cursor, count=count,
current_region=current_region)
# 生成一个作品评论的带有加密参数的Endpoint
endpoint = BogusManager.model_2_endpoint(
TikTokAPIEndpoints.POST_COMMENT_REPLY, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取用户的粉丝列表
async def fetch_user_fans(self, secUid: str, count: int = 30, maxCursor: int = 0, minCursor: int = 0):
# 获取TikTok的实时Cookie
kwargs = await self.get_tiktok_headers()
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
# 创建一个用户关注的BaseModel参数
params = UserFans(secUid=secUid, count=count, maxCursor=maxCursor, minCursor=minCursor)
# 生成一个用户关注的带有加密参数的Endpoint
endpoint = BogusManager.model_2_endpoint(
TikTokAPIEndpoints.USER_FANS, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
# 获取用户的关注列表
async def fetch_user_follow(self, secUid: str, count: int = 30, maxCursor: int = 0, minCursor: int = 0):
# 获取TikTok的实时Cookie
kwargs = await self.get_tiktok_headers()
# 创建一个基础爬虫
base_crawler = BaseCrawler(proxies=kwargs["proxies"], crawler_headers=kwargs["headers"])
async with base_crawler as crawler:
# 创建一个用户关注的BaseModel参数
params = UserFollow(secUid=secUid, count=count, maxCursor=maxCursor, minCursor=minCursor)
# 生成一个用户关注的带有加密参数的Endpoint
endpoint = BogusManager.model_2_endpoint(
TikTokAPIEndpoints.USER_FOLLOW, params.dict(), kwargs["headers"]["User-Agent"]
)
response = await crawler.fetch_get_json(endpoint)
return response
"""-------------------------------------------------------utils接口列表-------------------------------------------------------"""
# 生成真实msToken
async def fetch_real_msToken(self):
result = {
"msToken": TokenManager().gen_real_msToken()
}
return result
# 生成ttwid
async def gen_ttwid(self, cookie: str):
result = {
"ttwid": TokenManager().gen_ttwid(cookie)
}
return result
# 生成xbogus
async def gen_xbogus(self, url: str, user_agent: str):
url = BogusManager.xb_str_2_endpoint(user_agent, url)
result = {
"url": url,
"x_bogus": url.split("&X-Bogus=")[1],
"user_agent": user_agent
}
return result
# 提取单个用户id
async def get_sec_user_id(self, url: str):
return await SecUserIdFetcher.get_secuid(url)
# 提取列表用户id
async def get_all_sec_user_id(self, urls: list):
# 提取有效URL
urls = extract_valid_urls(urls)
# 对于URL列表
return await SecUserIdFetcher.get_all_secuid(urls)
# 提取单个作品id
async def get_aweme_id(self, url: str):
return await AwemeIdFetcher.get_aweme_id(url)
# 提取列表作品id
async def get_all_aweme_id(self, urls: list):
# 提取有效URL
urls = extract_valid_urls(urls)
# 对于URL列表
return await AwemeIdFetcher.get_all_aweme_id(urls)
# 获取用户unique_id
async def get_unique_id(self, url: str):
return await SecUserIdFetcher.get_uniqueid(url)
# 获取列表unique_id列表
async def get_all_unique_id(self, urls: list):
# 提取有效URL
urls = extract_valid_urls(urls)
# 对于URL列表
return await SecUserIdFetcher.get_all_uniqueid(urls)
"""-------------------------------------------------------main接口列表-------------------------------------------------------"""
async def main(self):
# 获取单个作品数据
# item_id = "7339393672959757570"
# response = await self.fetch_one_video(item_id)
# print(response)
# 获取用户的个人信息
# secUid = "MS4wLjABAAAAfDPs6wbpBcMMb85xkvDGdyyyVAUS2YoVCT9P6WQ1bpuwEuPhL9eFtTmGvxw1lT2C"
# uniqueId = "c4shjaz"
# response = await self.fetch_user_profile(secUid, uniqueId)
# print(response)
# 获取用户的作品列表
# secUid = "MS4wLjABAAAAfDPs6wbpBcMMb85xkvDGdyyyVAUS2YoVCT9P6WQ1bpuwEuPhL9eFtTmGvxw1lT2C"
# cursor = 0
# count = 35
# coverFormat = 2
# response = await self.fetch_user_post(secUid, cursor, count, coverFormat)
# print(response)
# 获取用户的点赞列表
# secUid = "MS4wLjABAAAAq1iRXNduFZpY301UkVpJ1eQT60_NiWS9QQSeNqmNQEDJp0pOF8cpleNEdiJx5_IU"
# cursor = 0
# count = 30
# coverFormat = 2
# response = await self.fetch_user_like(secUid, cursor, count, coverFormat)
# print(response)
# 获取用户的收藏列表
# cookie = "ttwid=1%7Cbf2dxDZQ2bEXkDZCxVuIuo_p3DnLZydqkW-M-yEHrQ8%7C1712658199%7C4163e4b82a08fd8160ec7d3959a3c07b2dea47b1184fd4e35619ba732a48c1f7; tt_chain_token=AlPMTLvrUZOd3f21B8dbGQ==; tiktok_webapp_theme=light; odin_tt=70b04a203cd043dbe532561dd854ce106b9c0253acf3c105969313c9f58b97bafd88ba18dd551651e4a4bf92d02067b281ca82d4a75acb17b95d7d39103aa4da0f3597fe75e56404891f651147b9bc3f; passport_csrf_token=fd79484bbdda182d711ad15fa11c059d; passport_csrf_token_default=fd79484bbdda182d711ad15fa11c059d; multi_sids=7262344915936658478%3Aa5b1dc5805dee79797d4470ffa822e45; cmpl_token=AgQQAPNSF-RO0rT35p6-9h0X_5yvjXFav6jZYNCqgQ; sid_guard=a5b1dc5805dee79797d4470ffa822e45%7C1711264624%7C15552000%7CFri%2C+20-Sep-2024+07%3A17%3A04+GMT; uid_tt=59c2975ea05dc11e9f7f548a658d65348935cb555d188c432fe2e5d0bd6b75e1; uid_tt_ss=59c2975ea05dc11e9f7f548a658d65348935cb555d188c432fe2e5d0bd6b75e1; sid_tt=a5b1dc5805dee79797d4470ffa822e45; sessionid=a5b1dc5805dee79797d4470ffa822e45; sessionid_ss=a5b1dc5805dee79797d4470ffa822e45; sid_ucp_v1=1.0.0-KDNkZDkzMWRkNzMwOTU4NWVjZWNjZDAxZGE0NDgyNDJlYWQ3MTU4YjcKIAiuiJO2k4fC5GQQ8Kb_rwYYswsgDDDCkaSmBjgBQOoHEAQaB3VzZWFzdDUiIGE1YjFkYzU4MDVkZWU3OTc5N2Q0NDcwZmZhODIyZTQ1; ssid_ucp_v1=1.0.0-KDNkZDkzMWRkNzMwOTU4NWVjZWNjZDAxZGE0NDgyNDJlYWQ3MTU4YjcKIAiuiJO2k4fC5GQQ8Kb_rwYYswsgDDDCkaSmBjgBQOoHEAQaB3VzZWFzdDUiIGE1YjFkYzU4MDVkZWU3OTc5N2Q0NDcwZmZhODIyZTQ1; store-idc=useast5; store-country-code=us; store-country-code-src=uid; tt-target-idc=useast5; tt-target-idc-sign=hG594kSqj5ODjFKxKHu5O7_f0V5snMn-pNH9xbtYUWPjnSrq6LBUTwNfeMfgVdLAPiOuLNMMg5XQw54qK0h3--UlLvHwwkLKKvtGRPKb0PV3sJSg_M6EAAMDXHVz8dYwteofNu4j7ojwk10lEmZpuzXaAT574unoAu3gk-gJV9DlFkw-kwD_rHOkelP6Lk0Gw1fGDY3Om20-0tVFVJ8mmEOZtjkxnwgZ8-Eo707aT1lGidJ8PZKIbNkgo3udCLpy3rSl-LuUsYiPOfTwg1Ih7nv4Y2OeBOG5w21c9_6Yeo9SGcjf2gtuwQASkjWDy1vLxipf0cYrfFhVlOk2xgZi-RgbhQg62kiYSUe-Mv0zgmYPTTPXMz9okU3Y3IEklz-2adsaZY5kW8qJRBYp1yUhXPr5e7rf04Clj453VxYO5urfI935gMyB8p1KXpJrSaO3OLyZ2YKyNLnEGXBCMBbDzCOrMj8guP42JPOuf-fTIJkYhfHpqqTBHj3h7st22jY3; last_login_method=email; msToken=uhdWmwVzRuGO6X5Qzob8gV2OmXcf6t0FAQeHHIQAx__ZNZubBNAbfK3W7T-E4EN8_OfagmkDu_qAwo6MSaq-qhtPel9tndR7ZaNzoIhsyueXfEYtl8AfTxR9JZk5UemPY7GQ; tt_csrf_token=kOLsNXq3-r4m3r3K1YME0Ga961eYwydSITsM; ak_bmsc=A665D13C6586D765DDF447442952896E~000000000000000000000000000000~YAAQyuIsF15v8sCOAQAAKNIGwhcV5kjZ/zGNalPWSyQXieZLY9SKy2VxEzWDzgy/YRSQLTVBC9WBFE+RXuayfyihX1Rbsm+DQ4sTtTzHIUY20Yu1tWQjcakcvZXtfLp7I0q6fcP9ai53N1iG5wwovNyFJ1eyMQVLm650J2PK59+7DcZn3kmOAHjMl6plVx6mVRoTqXcRte4b5JDtORNbXAbjSgbQiJK5fpJPSFKd19j3C/7hJij3mFvcVMa7hlBIDYlDAshlyrykE/MaoRof0RRZCNuJfO45ZdftS5+YYhuLZXbUsaJXsj3DjmaZj90kNyhovKh4AepeXm6FeC2Mejp4nCILsF0c9nZmTl2jh+TerAffOmdwieV2UejJgfAIcSXrZ1dVWtu0dQ==; passport_fe_beating_status=true; perf_feed_cache={%22expireTimestamp%22:1712822400000%2C%22itemIds%22:[%227351502927267171630%22%2C%227349225060231679275%22%2C%227351880657787800864%22]}; bm_sv=0B6DC970D9D01727EB672641008F9C9D~YAAQxeIsFy0Bw5iOAQAAOChiwhecEatJ7PnT5D7Z1Cr2pcpz/MkHKvKSfCtcnD+888hhDkzPART3agFPgqxJzjotbMOL9vMYCaF331W/Ahjx+/L9pVnvOqKVyhwYw58LJG1NgB2tliqi18Exun04pomLtD195iDMfNbGdYmGt7WrtWumTuI7qEjkR4QAClrPNP1JFYUGT9A29+iGzc+/wqDZiG+0AGXUlTEBmIKM398S9DqSWpJrRvNODjMe7Fg2~1; _waftokenid=eyJ2Ijp7ImEiOiJweEtDTWVrOENLZEZyMDY5bzE1T01KL2RncVNSVWlzZ0xtQVAyZXJDeWw0PSIsImIiOjE3MTI2NTgxMjYsImMiOiJJTU94cGtQS1FuSG1XbVNOYUx0Yk5ldndQR2w5RkxIbkJXaEExOTVPbzFZPSJ9LCJzIjoiL3JDZDl1dXVIWkRvSTc2M1dLM2VHR2hNQU9vbGJGKy9oMFdtTXE3Zkgwaz0ifQ; 
# msToken=J4gOwTXU0OEU6MH3VjhUVmCYMNGqtznsRSDxyl8KNlfSTPM7fMYYXHPPIpldHRFUx_fPfyhVclvrX6WnrNdxk3GzK_7BHmhIIkQbQ1zSSnqY2Phn6QduvErSGvCQyvxwTctWzslT3BqiIg=="
# secUid = "MS4wLjABAAAAq1iRXNduFZpY301UkVpJ1eQT60_NiWS9QQSeNqmNQEDJp0pOF8cpleNEdiJx5_IU"
# cursor = 0
# count = 30
# coverFormat = 2
# response = await self.fetch_user_collect(cookie, secUid, cursor, count, coverFormat)
# print(response)
# 获取用户的播放列表
# secUid = "MS4wLjABAAAAtGboV-mJHSIQqh-SsG30QKweGhSqkr4xJLq1qqgAWDzu3vDO5LUhUcCP4UEY5LwC"
# cursor = 0
# count = 30
# response = await self.fetch_user_play_list(secUid, cursor, count)
# print(response)
# 获取用户的合辑列表
# mixId = "7101538765474106158"
# cursor = 0
# count = 30
# response = await self.fetch_user_mix(mixId, cursor, count)
# print(response)
# 获取作品的评论列表
# aweme_id = "7304809083817774382"
# cursor = 0
# count = 20
# current_region = ""
# response = await self.fetch_post_comment(aweme_id, cursor, count, current_region)
# print(response)
# 获取作品的评论回复列表
# item_id = "7304809083817774382"
# comment_id = "7304877760886588191"
# cursor = 0
# count = 20
# current_region = ""
# response = await self.fetch_post_comment_reply(item_id, comment_id, cursor, count, current_region)
# print(response)
# 获取用户的关注列表
# secUid = "MS4wLjABAAAAtGboV-mJHSIQqh-SsG30QKweGhSqkr4xJLq1qqgAWDzu3vDO5LUhUcCP4UEY5LwC"
# count = 30
# maxCursor = 0
# minCursor = 0
# response = await self.fetch_user_follow(secUid, count, maxCursor, minCursor)
# print(response)
# 获取用户的粉丝列表
# secUid = "MS4wLjABAAAAtGboV-mJHSIQqh-SsG30QKweGhSqkr4xJLq1qqgAWDzu3vDO5LUhUcCP4UEY5LwC"
# count = 30
# maxCursor = 0
# minCursor = 0
# response = await self.fetch_user_fans(secUid, count, maxCursor, minCursor)
# print(response)
"""-------------------------------------------------------utils接口列表-------------------------------------------------------"""
# # 生成真实msToken
# response = await self.fetch_real_msToken()
# print(response)
# 生成ttwid
# cookie = "ttwid=1%7Cbf2dxDZQ2bEXkDZCxVuIuo_p3DnLZydqkW-M-yEHrQ8%7C1712664278%7C6ed45f1bb91c86eda1e08c6f60da7898591ed368ee7feece0a78f48d6e734d71; tt_chain_token=AlPMTLvrUZOd3f21B8dbGQ==; tiktok_webapp_theme=light; odin_tt=dada47a81f211d932ae6c5a0dc99b4e87029d27b49b0fc4a83faf6bc91bebddcf3665c25f1a7c0934dc27c89ab2a7151e116cc22110c24b927051de11b54c3e66e6567dba55e88d2f92f52a7b857d27f; passport_csrf_token=fd79484bbdda182d711ad15fa11c059d; passport_csrf_token_default=fd79484bbdda182d711ad15fa11c059d; multi_sids=7262344915936658478%3Aa5b1dc5805dee79797d4470ffa822e45; cmpl_token=AgQQAPNSF-RO0rT35p6-9h0X_5yvjXFav6jZYNCqgQ; sid_guard=a5b1dc5805dee79797d4470ffa822e45%7C1711264624%7C15552000%7CFri%2C+20-Sep-2024+07%3A17%3A04+GMT; uid_tt=59c2975ea05dc11e9f7f548a658d65348935cb555d188c432fe2e5d0bd6b75e1; uid_tt_ss=59c2975ea05dc11e9f7f548a658d65348935cb555d188c432fe2e5d0bd6b75e1; sid_tt=a5b1dc5805dee79797d4470ffa822e45; sessionid=a5b1dc5805dee79797d4470ffa822e45; sessionid_ss=a5b1dc5805dee79797d4470ffa822e45; sid_ucp_v1=1.0.0-KDNkZDkzMWRkNzMwOTU4NWVjZWNjZDAxZGE0NDgyNDJlYWQ3MTU4YjcKIAiuiJO2k4fC5GQQ8Kb_rwYYswsgDDDCkaSmBjgBQOoHEAQaB3VzZWFzdDUiIGE1YjFkYzU4MDVkZWU3OTc5N2Q0NDcwZmZhODIyZTQ1; ssid_ucp_v1=1.0.0-KDNkZDkzMWRkNzMwOTU4NWVjZWNjZDAxZGE0NDgyNDJlYWQ3MTU4YjcKIAiuiJO2k4fC5GQQ8Kb_rwYYswsgDDDCkaSmBjgBQOoHEAQaB3VzZWFzdDUiIGE1YjFkYzU4MDVkZWU3OTc5N2Q0NDcwZmZhODIyZTQ1; store-idc=useast5; store-country-code=us; store-country-code-src=uid; tt-target-idc=useast5; tt-target-idc-sign=hG594kSqj5ODjFKxKHu5O7_f0V5snMn-pNH9xbtYUWPjnSrq6LBUTwNfeMfgVdLAPiOuLNMMg5XQw54qK0h3--UlLvHwwkLKKvtGRPKb0PV3sJSg_M6EAAMDXHVz8dYwteofNu4j7ojwk10lEmZpuzXaAT574unoAu3gk-gJV9DlFkw-kwD_rHOkelP6Lk0Gw1fGDY3Om20-0tVFVJ8mmEOZtjkxnwgZ8-Eo707aT1lGidJ8PZKIbNkgo3udCLpy3rSl-LuUsYiPOfTwg1Ih7nv4Y2OeBOG5w21c9_6Yeo9SGcjf2gtuwQASkjWDy1vLxipf0cYrfFhVlOk2xgZi-RgbhQg62kiYSUe-Mv0zgmYPTTPXMz9okU3Y3IEklz-2adsaZY5kW8qJRBYp1yUhXPr5e7rf04Clj453VxYO5urfI935gMyB8p1KXpJrSaO3OLyZ2YKyNLnEGXBCMBbDzCOrMj8guP42JPOuf-fTIJkYhfHpqqTBHj3h7st22jY3; last_login_method=email; msToken=t3WAFkKaIqii9eMTPHdks49bnQnrcb-qA8i0K6CbTF-2MBdObKnGfU3icz53Wot781BQXPL4PT4ZI2nIFacyECthXfFPg6JrkghC_Yn7pf6aut5YU7PpqeuAa3Ipg-1OAYS2rF3UrQOetw==; tt_csrf_token=kOLsNXq3-r4m3r3K1YME0Ga961eYwydSITsM; passport_fe_beating_status=true; ak_bmsc=24B67BCA11EE8356DD40E4F7E23FC910~000000000000000000000000000000~YAAQyOIsF//Uz6COAQAAe/5/whdqkLc0AH5d8i2rlJjtjjMuK1K+tU3kWANu/7PWSoOfbjdVZi9uFRKsuQj1DU0Ztew9uihAT+jWEvqNo1cbee/FRwJLMZk80po2nI/GwpLKd51dHN5HW7dWc58oeWwpIOh/g2uVkXJjtVotM4sBcEYxO1i9fpOF0bgOOVXrElmpqaFVfWBJc0iBBHHBZEAHNSIGHjPHY7j+Gy4YMoldkJKJbMVtRWa3ChfagC7vgWsiy2sb3lDGRbSQgjYj128i2lwoPtbcA3bzLZPd6jseTgRHm01arjyMJh3RrhDUyu71D1jMPuWhVkAjAV21vrsJFmgFWQ3Ryv23yDa6kSq7qRBLQAxA6rYcYBRNPdoNOAaiLt5IBCHiWQ==; bm_sv=FB0D8C84058596C33C95C34477181900~YAAQ3+IsF1bJy52OAQAAIOy+whed7uxWbcSbX9ww9mH0hyPTbQLecKO8qrRli6xSOco/iX7wxO9RXW6ikifIc2RFIbvzAZg3j58QNuEJ5hXrIia0yc0T/j1cCjhqBypDWXc6KaC716f293K5lbOFoD7QMFCA3K4YsMK3mJa2KaBzOvwOTdbpmow8PM0fKAXwSxJDPweGCzxZhiMNC+H/TFscBYb5HdDZ7PkuiN97YPtVEtfMd4WcpHos3e5dFJS+~1; living_user_id=743289106087; csrfToken=ndffXer7-a5_J5Me7ffRmSSJfzfMDmX3B6oM; perf_feed_cache={%22expireTimestamp%22:1712836800000%2C%22itemIds%22:[%227348432053190380842%22%2C%227352682225256762666%22%2C%227350485674086157611%22]}; csrf_session_id=120d8aacffb06addd01cb40859003c8e; msToken=8tbUEEpe8rNjEkOrH6M5v-RXjSUaCAARr-Knyk11POdQHvjVvQjbXEbEP4vcAf2ftbB7ycHOz7LlQbADkPURuVvseAstrFbSmxY3uVJOLAgS6O9h3BnXmpHH_kPw4xXgF2zDv4E-U58JZw=="
# response = await self.gen_ttwid(cookie)
# print(response)
# 生成xbogus
# url = "https://www.tiktok.com/api/item/detail/?WebIdLastTime=1712665533&aid=1988&app_language=en&app_name=tiktok_web&browser_language=en-US&browser_name=Mozilla&browser_online=true&browser_platform=Win32&browser_version=5.0%20%28Windows%29&channel=tiktok_web&cookie_enabled=true&device_id=7349090360347690538&device_platform=web_pc&focus_state=true&from_page=user&history_len=4&is_fullscreen=false&is_page_visible=true&language=en&os=windows&priority_region=US&referer=&region=US&root_referer=https%3A%2F%2Fwww.tiktok.com%2F&screen_height=1080&screen_width=1920&webcast_language=en&tz_name=America%2FTijuana&msToken=AYFCEapCLbMrS8uTLBoYdUMeeVLbCdFQ_QF_-OcjzJw1CPr4JQhWUtagy0k4a9IITAqi5Qxr2Vdh9mgCbyGxTnvWLa4ZVY6IiSf6lcST-tr0IXfl-r_ZTpzvWDoQfqOVsWCTlSNkhAwB-tap5g==&itemId=7339393672959757570"
# user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
# response = await self.gen_xbogus(url, user_agent)
# print(response)
# 提取单个用户secUid
# url = "https://www.tiktok.com/@tiktok"
# response = await self.get_sec_user_id(url)
# print(response)
# 提取多个用户secUid
# urls = ["https://www.tiktok.com/@tiktok", "https://www.tiktok.com/@taylorswift"]
# response = await self.get_all_sec_user_id(urls)
# print(response)
# 提取单个作品id
# url = "https://www.tiktok.com/@taylorswift/video/7162153915952352558"
# response = await self.get_aweme_id(url)
# print(response)
# 提取多个作品id
# urls = ["https://www.tiktok.com/@taylorswift/video/7162153915952352558", "https://www.tiktok.com/@taylorswift/video/7137077445680745771"]
# response = await self.get_all_aweme_id(urls)
# print(response)
# 获取用户unique_id
# url = "https://www.tiktok.com/@tiktok"
# response = await self.get_unique_id(url)
# print(response)
# 获取多个用户unique_id
# urls = ["https://www.tiktok.com/@tiktok", "https://www.tiktok.com/@taylorswift"]
# response = await self.get_all_unique_id(urls)
# print(response)
# 占位
pass
if __name__ == "__main__":
# 初始化
crawler = TikTokWebCrawler()
# 开始时间
start = time.time()
asyncio.run(crawler.main())
# 结束时间
end = time.time()
print(f"耗时:{end - start}")

View File

@ -0,0 +1,105 @@
# ==============================================================================
# Copyright (C) 2021 Evil0ctal
#
# This file is part of the Douyin_TikTok_Download_API project.
#
# This project is licensed under the Apache License 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
#         __
#        />  フ
#       |  _  _ l
#       ` ミ_x
#      /      | Feed me Stars ⭐
#     /  ヽ   ノ
#     │  | | |
#  / ̄|   | | |
#  | ( ̄ヽ__ヽ_)__)
#  \二つ
# ==============================================================================
#
# Contributor Link:
# - https://github.com/Evil0ctal
# - https://github.com/Johnserf-Seed
#
# ==============================================================================
class APIError(Exception):
"""基本API异常类其他API异常都会继承这个类"""
def __init__(self, status_code=None):
self.status_code = status_code
print(
"程序出现异常,请检查错误信息。"
)
def display_error(self):
"""显示错误信息和状态码(如果有的话)"""
return f"Error: {self.args[0]}." + (
f" Status Code: {self.status_code}." if self.status_code else ""
)
class APIConnectionError(APIError):
"""当与API的连接出现问题时抛出"""
def display_error(self):
return f"API Connection Error: {self.args[0]}."
class APIUnavailableError(APIError):
"""当API服务不可用时抛出例如维护或超时"""
def display_error(self):
return f"API Unavailable Error: {self.args[0]}."
class APINotFoundError(APIError):
"""当API端点不存在时抛出"""
def display_error(self):
return f"API Not Found Error: {self.args[0]}."
class APIResponseError(APIError):
"""当API返回的响应与预期不符时抛出"""
def display_error(self):
return f"API Response Error: {self.args[0]}."
class APIRateLimitError(APIError):
"""当达到API的请求速率限制时抛出"""
def display_error(self):
return f"API Rate Limit Error: {self.args[0]}."
class APITimeoutError(APIError):
"""当API请求超时时抛出"""
def display_error(self):
return f"API Timeout Error: {self.args[0]}."
class APIUnauthorizedError(APIError):
"""当API请求由于授权失败而被拒绝时抛出"""
def display_error(self):
return f"API Unauthorized Error: {self.args[0]}."
class APIRetryExhaustedError(APIError):
"""当API请求重试次数用尽时抛出"""
def display_error(self):
return f"API Retry Exhausted Error: {self.args[0]}."

169
crawlers/utils/logger.py Normal file
View File

@ -0,0 +1,169 @@
# ==============================================================================
# Copyright (C) 2021 Evil0ctal
#
# This file is part of the Douyin_TikTok_Download_API project.
#
# This project is licensed under the Apache License 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
#         __
#        />  フ
#       |  _  _ l
#       ` ミ_x
#      /      | Feed me Stars ⭐
#     /  ヽ   ノ
#     │  | | |
#  / ̄|   | | |
#  | ( ̄ヽ__ヽ_)__)
#  \二つ
# ==============================================================================
#
# Contributor Link:
# - https://github.com/Evil0ctal
# - https://github.com/Johnserf-Seed
#
# ==============================================================================
import threading
import time
import logging
import datetime
from pathlib import Path
from rich.logging import RichHandler
from logging.handlers import TimedRotatingFileHandler
class Singleton(type):
_instances = {} # 存储实例的字典
_lock: threading.Lock = threading.Lock() # 线程锁
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
def __call__(cls, *args, **kwargs):
"""
重写默认的类实例化方法当尝试创建类的一个新实例时此方法将被调用
如果已经有一个与参数匹配的实例存在则返回该实例否则创建一个新实例
"""
key = (cls, args, frozenset(kwargs.items()))
with cls._lock:
if key not in cls._instances:
instance = super().__call__(*args, **kwargs)
cls._instances[key] = instance
return cls._instances[key]
@classmethod
def reset_instance(cls, *args, **kwargs):
"""
重置指定参数的实例这只是从 _instances 字典中删除实例的引用
并不真正删除该实例如果其他地方仍引用该实例它仍然存在且可用
"""
key = (cls, args, frozenset(kwargs.items()))
with cls._lock:
if key in cls._instances:
del cls._instances[key]
class LogManager(metaclass=Singleton):
def __init__(self):
if getattr(self, "_initialized", False): # 防止重复初始化
return
self.logger = logging.getLogger("TikHub_Crawlers")
self.logger.setLevel(logging.INFO)
self.log_dir = None
self._initialized = True
def setup_logging(self, level=logging.INFO, log_to_console=False, log_path=None):
self.logger.handlers.clear()
self.logger.setLevel(level)
if log_to_console:
ch = RichHandler(
show_time=False,
show_path=False,
markup=True,
keywords=(RichHandler.KEYWORDS or []) + ["STREAM"],
rich_tracebacks=True,
)
ch.setFormatter(logging.Formatter("{message}", style="{", datefmt="[%X]"))
self.logger.addHandler(ch)
if log_path:
self.log_dir = Path(log_path)
self.ensure_log_dir_exists(self.log_dir)
log_file_name = datetime.datetime.now().strftime("%Y-%m-%d-%H-%M-%S.log")
log_file = self.log_dir.joinpath(log_file_name)
fh = TimedRotatingFileHandler(
log_file, when="midnight", interval=1, backupCount=99, encoding="utf-8"
)
fh.setFormatter(
logging.Formatter(
"%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
)
self.logger.addHandler(fh)
@staticmethod
def ensure_log_dir_exists(log_path: Path):
log_path.mkdir(parents=True, exist_ok=True)
def clean_logs(self, keep_last_n=10):
"""保留最近的n个日志文件并删除其他文件"""
if not self.log_dir:
return
# self.shutdown()
all_logs = sorted(self.log_dir.glob("*.log"))
if keep_last_n == 0:
files_to_delete = all_logs
else:
files_to_delete = all_logs[:-keep_last_n]
for log_file in files_to_delete:
try:
log_file.unlink()
except PermissionError:
self.logger.warning(
f"无法删除日志文件 {log_file}, 它正被另一个进程使用"
)
def shutdown(self):
for handler in self.logger.handlers:
handler.close()
self.logger.removeHandler(handler)
self.logger.handlers.clear()
time.sleep(1) # 确保文件被释放
def log_setup(log_to_console=True):
logger = logging.getLogger("TikHub_Crawlers")
if logger.hasHandlers():
# logger已经被设置不做任何操作
return logger
# 创建临时的日志目录
temp_log_dir = Path("./logs")
temp_log_dir.mkdir(exist_ok=True)
# 初始化日志管理器
log_manager = LogManager()
log_manager.setup_logging(
level=logging.INFO, log_to_console=log_to_console, log_path=temp_log_dir
)
# 只保留1000个日志文件
log_manager.clean_logs(1000)
return logger
logger = log_setup()
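# --- 使用示例 (Usage sketch) ---
# Illustrative only: the module-level `logger` above is already configured by log_setup(),
# so importing this module and calling the logger is all that is needed.
if __name__ == "__main__":
    logger.info("logger smoke test")
    logger.warning("warnings are also written to ./logs/*.log")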

394
crawlers/utils/utils.py Normal file
View File

@ -0,0 +1,394 @@
# ==============================================================================
# Copyright (C) 2021 Evil0ctal
#
# This file is part of the Douyin_TikTok_Download_API project.
#
# This project is licensed under the Apache License 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
#         __
#        />  フ
#       |  _  _ l
#       ` ミ_x
#      /      | Feed me Stars ⭐
#     /  ヽ   ノ
#     │  | | |
#  / ̄|   | | |
#  | ( ̄ヽ__ヽ_)__)
#  \二つ
# ==============================================================================
#
# Contributor Link:
# - https://github.com/Evil0ctal
# - https://github.com/Johnserf-Seed
#
# ==============================================================================
import re
import sys
import random
import secrets
import datetime
import browser_cookie3
import importlib_resources
from pydantic import BaseModel
from urllib.parse import quote, urlencode # URL编码
from typing import Union, Any
from pathlib import Path
# 生成一个 16 字节的随机字节串 (Generate a random byte string of 16 bytes)
seed_bytes = secrets.token_bytes(16)
# 将字节字符串转换为整数 (Convert the byte string to an integer)
seed_int = int.from_bytes(seed_bytes, "big")
# 设置随机种子 (Seed the random module)
random.seed(seed_int)
# 将模型实例转换为字典
def model_to_query_string(model: BaseModel) -> str:
model_dict = model.dict()
# 使用urlencode进行URL编码
query_string = urlencode(model_dict)
return query_string
def gen_random_str(randomlength: int) -> str:
"""
根据传入长度产生随机字符串 (Generate a random string based on the given length)
Args:
randomlength (int): 需要生成的随机字符串的长度 (The length of the random string to be generated)
Returns:
str: 生成的随机字符串 (The generated random string)
"""
base_str = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+-"
return "".join(random.choice(base_str) for _ in range(randomlength))
def get_timestamp(unit: str = "milli"):
"""
根据给定的单位获取当前时间 (Get the current time based on the given unit)
Args:
unit (str): 时间单位可以是 "milli""sec""min"
(The time unit, which can be "milli", "sec", "min", etc.)
Returns:
int: 根据给定单位的当前时间 (The current time based on the given unit)
"""
now = datetime.datetime.utcnow() - datetime.datetime(1970, 1, 1)
if unit == "milli":
return int(now.total_seconds() * 1000)
elif unit == "sec":
return int(now.total_seconds())
elif unit == "min":
return int(now.total_seconds() / 60)
else:
raise ValueError("Unsupported time unit")
def timestamp_2_str(
timestamp: Union[str, int, float], format: str = "%Y-%m-%d %H-%M-%S"
) -> str:
"""
UNIX 时间戳转换为格式化字符串 (Convert a UNIX timestamp to a formatted string)
Args:
timestamp (int): 要转换的 UNIX 时间戳 (The UNIX timestamp to be converted)
format (str, optional): 返回的日期时间字符串的格式
默认为 '%Y-%m-%d %H-%M-%S'
(The format for the returned date-time string
Defaults to '%Y-%m-%d %H-%M-%S')
Returns:
str: 格式化的日期时间字符串 (The formatted date-time string)
"""
if timestamp is None or timestamp == "None":
return ""
if isinstance(timestamp, str):
if len(timestamp) == 30:
return datetime.datetime.strptime(timestamp, "%a %b %d %H:%M:%S %z %Y").strftime(format)
return datetime.datetime.fromtimestamp(float(timestamp)).strftime(format)
def num_to_base36(num: int) -> str:
"""数字转换成base32 (Convert number to base 36)"""
base_str = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
if num == 0:
return "0"
base36 = []
while num:
num, i = divmod(num, 36)
base36.append(base_str[i])
return "".join(reversed(base36))
def split_set_cookie(cookie_str: str) -> str:
"""
拆分Set-Cookie字符串并拼接 (Split the Set-Cookie string and concatenate)
Args:
cookie_str (str): 待拆分的Set-Cookie字符串 (The Set-Cookie string to be split)
Returns:
str: 拼接后的Cookie字符串 (Concatenated cookie string)
"""
# 判断是否为字符串 / Check if it's a string
if not isinstance(cookie_str, str):
raise TypeError("`set-cookie` must be str")
# 拆分Set-Cookie字符串,避免错误地在expires字段的值中分割字符串 (Split the Set-Cookie string, avoiding incorrect splitting on the value of the 'expires' field)
# 拆分每个Cookie字符串只获取第一个分段即key=value部分 / Split each Cookie string, only getting the first segment (i.e., key=value part)
# 拼接所有的Cookie (Concatenate all cookies)
return ";".join(
cookie.split(";")[0] for cookie in re.split(", (?=[a-zA-Z])", cookie_str)
)
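# 示例 (Example, illustrative):
#   split_set_cookie("a=1; Path=/; expires=Wed, 21 Oct 2025 07:28:00 GMT, b=2; Path=/")
#   -> "a=1;b=2"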
def split_dict_cookie(cookie_dict: dict) -> str:
return "; ".join(f"{key}={value}" for key, value in cookie_dict.items())
def extract_valid_urls(inputs: Union[str, list[str]]) -> Union[str, list[str], None]:
"""从输入中提取有效的URL (Extract valid URLs from input)
Args:
inputs (Union[str, list[str]]): 输入的字符串或字符串列表 (Input string or list of strings)
Returns:
Union[str, list[str]]: 提取出的有效URL或URL列表 (Extracted valid URL or list of URLs)
"""
url_pattern = re.compile(r"https?://\S+")
# 如果输入是单个字符串
if isinstance(inputs, str):
match = url_pattern.search(inputs)
return match.group(0) if match else None
# 如果输入是字符串列表
elif isinstance(inputs, list):
valid_urls = []
for input_str in inputs:
matches = url_pattern.findall(input_str)
if matches:
valid_urls.extend(matches)
return valid_urls
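# 用法示意 (Illustrative usage sketch; 链接为占位示例 / the links are placeholders):
# >>> extract_valid_urls("复制打开 https://v.douyin.com/xxxxxx/ 看看")
# 'https://v.douyin.com/xxxxxx/'
# >>> extract_valid_urls(["no url here", "https://www.tiktok.com/@user/video/123 end"])
# ['https://www.tiktok.com/@user/video/123']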
def _get_first_item_from_list(_list) -> list:
# 检查是否是列表 (Check if it's a list)
if _list and isinstance(_list, list):
        # 如果列表里第一个元素还是列表,则提取每个子列表的第一个值
        # (If the first element is itself a list, take the first value of each inner list)
if isinstance(_list[0], list):
return [inner[0] for inner in _list if inner]
# 如果只是普通列表,则返回这个列表包含的第一个项目作为新列表
# (If it's just a regular list, return the first item wrapped in a list)
else:
return [_list[0]]
return []
def get_resource_path(filepath: str):
"""获取资源文件的路径 (Get the path of the resource file)
Args:
filepath: str: 文件路径 (file path)
"""
return importlib_resources.files("f2") / filepath
def replaceT(obj: Union[str, Any]) -> Union[str, Any]:
"""
替换文案非法字符 (Replace illegal characters in the text)
Args:
obj (str): 传入对象 (Input object)
Returns:
new: 处理后的内容 (Processed content)
"""
reSub = r"[^\u4e00-\u9fa5a-zA-Z0-9#]"
if isinstance(obj, list):
return [re.sub(reSub, "_", i) for i in obj]
if isinstance(obj, str):
return re.sub(reSub, "_", obj)
return obj
# raise TypeError("输入应为字符串或字符串列表")
def split_filename(text: str, os_limit: dict) -> str:
"""
    根据操作系统的文件名长度限制分割文件名,超出部分用 '......' 代替
    (Truncate the file name according to the OS filename length limit, replacing the middle with '......')
    Args:
        text (str): 要处理的文件名文本 (The file name text to process)
        os_limit (dict): 各操作系统的文件名长度限制字典 (Dictionary of filename length limits per OS)
    Returns:
        str: 分割后的文本 (The truncated text)
"""
# 获取操作系统名称和文件名长度限制
os_name = sys.platform
filename_length_limit = os_limit.get(os_name, 200)
    # 计算中文字符长度(每个中文字符按 3 计)(Each Chinese character counts as 3)
chinese_length = sum(1 for char in text if "\u4e00" <= char <= "\u9fff") * 3
# 计算英文字符长度
english_length = sum(1 for char in text if char.isalpha())
# 计算下划线数量
num_underscores = text.count("_")
# 计算总长度
total_length = chinese_length + english_length + num_underscores
# 如果总长度超过操作系统限制或手动设置的限制,则根据限制进行分割
if total_length > filename_length_limit:
split_index = min(total_length, filename_length_limit) // 2 - 6
split_text = text[:split_index] + "......" + text[-split_index:]
return split_text
else:
return text
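# 用法示意 (Illustrative usage sketch; 限制字典中的键值为假设示例,并非项目默认值 / the limit dict is an assumption, not a project default):
# >>> limits = {"win32": 200, "darwin": 200, "linux": 200}
# >>> split_filename("短标题", limits)                          # 未超限,原样返回
# '短标题'
# >>> long_name = "标题" * 100
# >>> len(split_filename(long_name, limits)) < len(long_name)   # 超限时中间用 '......' 截断
# True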
def ensure_path(path: Union[str, Path]) -> Path:
"""确保路径是一个Path对象 (Ensure the path is a Path object)"""
return Path(path) if isinstance(path, str) else path
def get_cookie_from_browser(browser_choice: str, domain: str = "") -> dict:
"""
    根据用户选择的浏览器获取指定 domain 的 cookie
    (Get the cookies of the given domain from the browser chosen by the user)
    Args:
        browser_choice (str): 用户选择的浏览器名称 (The browser chosen by the user)
        domain (str): 需要获取 cookie 的域名 (The domain whose cookies should be read)
    Returns:
        dict: *.domain 的 cookie 键值对 (Name-value pairs of the *.domain cookies)
"""
if not browser_choice or not domain:
return ""
BROWSER_FUNCTIONS = {
"chrome": browser_cookie3.chrome,
"firefox": browser_cookie3.firefox,
"edge": browser_cookie3.edge,
"opera": browser_cookie3.opera,
"opera_gx": browser_cookie3.opera_gx,
"safari": browser_cookie3.safari,
"chromium": browser_cookie3.chromium,
"brave": browser_cookie3.brave,
"vivaldi": browser_cookie3.vivaldi,
"librewolf": browser_cookie3.librewolf,
}
    cj_function = BROWSER_FUNCTIONS.get(browser_choice)
    if cj_function is None:  # 不支持的浏览器直接报错 (fail fast instead of calling None)
        raise ValueError(f"Unsupported browser: {browser_choice}")
    cj = cj_function(domain_name=domain)
cookie_value = {c.name: c.value for c in cj if c.domain.endswith(domain)}
return cookie_value
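# 用法示意 (Illustrative usage sketch; 依赖 browser_cookie3 与本机浏览器配置,结果取决于本机实际存在的 cookie
#  / requires browser_cookie3 and a local browser profile, so the result depends on the machine):
# >>> cookies = get_cookie_from_browser("chrome", "douyin.com")
# >>> cookie_header = split_dict_cookie(cookies)  # 可与 split_dict_cookie 组合成请求头可用的字符串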
def check_invalid_naming(
naming: str, allowed_patterns: list, allowed_separators: list
) -> list:
"""
检查命名是否符合命名模板 (Check if the naming conforms to the naming template)
Args:
naming (str): 命名字符串 (Naming string)
allowed_patterns (list): 允许的模式列表 (List of allowed patterns)
allowed_separators (list): 允许的分隔符列表 (List of allowed separators)
Returns:
list: 无效的模式列表 (List of invalid patterns)
"""
if not naming or not allowed_patterns or not allowed_separators:
return []
temp_naming = naming
invalid_patterns = []
# 检查提供的模式是否有效
for pattern in allowed_patterns:
if pattern in temp_naming:
temp_naming = temp_naming.replace(pattern, "")
# 此时temp_naming应只包含分隔符
for char in temp_naming:
if char not in allowed_separators:
invalid_patterns.append(char)
# 检查连续的无效模式或分隔符
for pattern in allowed_patterns:
# 检查像"{xxx}{xxx}"这样的模式
if pattern + pattern in naming:
invalid_patterns.append(pattern + pattern)
        for sep in allowed_separators:
            # 检查像"{xxx}-{xxx}"这样的模式 (Check patterns like "{xxx}-{xxx}")
            if pattern + sep + pattern in naming:
                invalid_patterns.append(pattern + sep + pattern)
return invalid_patterns
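# 用法示意 (Illustrative usage sketch; 模式与分隔符列表均为虚构示例 / the pattern and separator lists are made-up examples):
# >>> patterns = ["{create}", "{desc}", "{uid}"]
# >>> separators = ["_", "-"]
# >>> check_invalid_naming("{create}_{desc}", patterns, separators)
# []
# >>> check_invalid_naming("{create}.{desc}", patterns, separators)
# ['.']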
def merge_config(
main_conf: dict = ...,
custom_conf: dict = ...,
**kwargs,
):
"""
    合并配置参数,使 CLI 参数优先级高于自定义配置,自定义配置优先级高于主配置,最终生成完整的配置参数字典
    (Merge configuration parameters: CLI arguments override the custom config, and the custom config overrides the main config, producing the final config dict)
    Args:
        main_conf (dict): 主配置参数字典 (Main configuration dict)
        custom_conf (dict): 自定义配置参数字典 (Custom configuration dict)
        **kwargs: CLI 参数和其他额外的配置参数 (CLI arguments and other extra configuration parameters)
    Returns:
        dict: 合并后的配置参数字典 (The merged configuration dict)
"""
# 合并主配置和自定义配置
merged_conf = {}
for key, value in main_conf.items():
merged_conf[key] = value # 将主配置复制到合并后的配置中
for key, value in custom_conf.items():
if value is not None and value != "": # 只有值不为 None 和 空值,才进行合并
merged_conf[key] = value # 自定义配置参数会覆盖主配置中的同名参数
# 合并 CLI 参数与合并后的配置,确保 CLI 参数的优先级最高
for key, value in kwargs.items():
if key not in merged_conf: # 如果合并后的配置中没有这个键,则直接添加
merged_conf[key] = value
elif value is not None and value != "": # 如果值不为 None 和 空值,则进行合并
merged_conf[key] = value # CLI 参数会覆盖自定义配置和主配置中的同名参数
return merged_conf
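# 用法示意 (Illustrative usage sketch; 键名均为虚构示例 / the keys are made-up examples):
# >>> main_conf = {"cookie": "", "timeout": 10}
# >>> custom_conf = {"timeout": 30}
# >>> merge_config(main_conf, custom_conf, cookie="abc", proxy=None)
# {'cookie': 'abc', 'timeout': 30, 'proxy': None}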

View File

@ -1,5 +1,5 @@
[Unit]
Description=www/wwwroot/Douyin_TikTok_Download_API/web_app.py deamon
Description=Douyin_TikTok_Download_API daemon
After=rc-local.service
[Service]
@ -7,7 +7,7 @@ Type=simple
User=root
Group=root
WorkingDirectory=/www/wwwroot/Douyin_TikTok_Download_API
ExecStart=python3 web_app.py
ExecStart=python3 start.py
Restart=always
[Install]

View File

@ -1,14 +0,0 @@
[Unit]
Description=www/wwwroot/Douyin_TikTok_Download_API/web_api.py deamon
After=rc-local.service
[Service]
Type=simple
User=root
Group=root
WorkingDirectory=/www/wwwroot/Douyin_TikTok_Download_API
ExecStart=python3 web_api.py
Restart=always
[Install]
WantedBy=multi-user.target

View File

@ -9,7 +9,7 @@ services:
container_name: douyin_tiktok_download_api
restart: always
volumes:
- ./config.yml:/app/config.yml
      - ./config.yaml:/crawlers/douyin/web/config.yaml
      - ./config.yaml:/crawlers/tiktok/web/config.yaml
      - ./config.yaml:/crawlers/tiktok/app/config.yaml
environment:
TZ: Asia/Shanghai
deploy:

View File

@ -1,36 +1,36 @@
aiohttp~=3.8.4
aiosignal==1.3.1
anyio==3.6.2
async-timeout==4.0.2
attrs==22.2.0
Brotli==1.0.9
charset-normalizer==3.0.1
click==8.1.3
colorama==0.4.6
Deprecated==1.2.13
fastapi~=0.92.0
frozenlist==1.3.3
h11==0.14.0
idna==3.4
limits==2.8.0
multidict==6.0.4
orjson==3.9.15
packaging==22.0
pydantic~=1.10.12
PyExecJS==1.5.1
pywebio==1.7.1
six==1.16.0
slowapi==0.1.7
sniffio==1.3.0
starlette~=0.25.0
tenacity==8.2.1
tornado==6.3.3
typing_extensions==4.8.0
ua-parser==0.16.1
user-agents==2.2.0
uvicorn==0.20.0
wrapt==1.15.0
yarl==1.8.2
httpx~=0.25.0
requests~=2.28.2
PyYAML~=6.0.1
aiofiles==23.2.1
annotated-types==0.6.0
anyio==4.3.0
browser-cookie3==0.19.1
certifi==2024.2.2
click==8.1.7
colorama==0.4.6
fastapi==0.110.2
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
idna==3.7
importlib_resources==6.4.0
lz4==4.3.3
markdown-it-py==3.0.0
mdurl==0.1.2
numpy==2.0.0rc1
pycryptodomex==3.20.0
pydantic==2.7.0
pydantic_core==2.18.1
pyfiglet==1.0.2
Pygments==2.17.2
pypng==0.20220715.0
pywebio==1.8.3
pywebio-battery==0.6.0
PyYAML==6.0.1
qrcode==7.4.2
rich==13.7.1
sniffio==1.3.1
starlette==0.37.2
tornado==6.4
typing_extensions==4.11.0
ua-parser==0.18.0
user-agents==2.2.0
uvicorn==0.29.0
websockets==12.0

1021
scraper.py

File diff suppressed because it is too large Load Diff

41
start.py Normal file
View File

@ -0,0 +1,41 @@
# ==============================================================================
# Copyright (C) 2021 Evil0ctal
#
# This file is part of the Douyin_TikTok_Download_API project.
#
# This project is licensed under the Apache License 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
#         __
#        />  フ
#       |  _  _ l
#       ` ミ_x
#      /      | Feed me Stars ⭐
#     /  ヽ   ノ
#     │  | | |
#  / ̄|   | | |
#  | ( ̄ヽ__ヽ_)__)
#  \二つ
# ==============================================================================
#
# Contributor Link, Thanks for your contribution:
# - https://github.com/Evil0ctal
# - https://github.com/Johnserf-Seed
# - https://github.com/Evil0ctal/Douyin_TikTok_Download_API/graphs/contributors
#
# ==============================================================================
from app.main import app
import uvicorn
if __name__ == '__main__':
uvicorn.run(app, host='0.0.0.0', port=80)

View File

@ -1,3 +1,3 @@
#!/bin/sh
python3 web_app.py &
python3 web_api.py
python3 start.py

View File

@ -1,72 +0,0 @@
// CodeMirror, copyright (c) by Marijn Haverbeke and others
// Distributed under an MIT license: https://codemirror.net/LICENSE
(function(mod) {
if (typeof exports == "object" && typeof module == "object") // CommonJS
mod(require("../../lib/codemirror"));
else if (typeof define == "function" && define.amd) // AMD
define(["../../lib/codemirror"], mod);
else // Plain browser env
mod(CodeMirror);
})(function(CodeMirror) {
"use strict";
var WRAP_CLASS = "CodeMirror-activeline";
var BACK_CLASS = "CodeMirror-activeline-background";
var GUTT_CLASS = "CodeMirror-activeline-gutter";
CodeMirror.defineOption("styleActiveLine", false, function(cm, val, old) {
var prev = old == CodeMirror.Init ? false : old;
if (val == prev) return
if (prev) {
cm.off("beforeSelectionChange", selectionChange);
clearActiveLines(cm);
delete cm.state.activeLines;
}
if (val) {
cm.state.activeLines = [];
updateActiveLines(cm, cm.listSelections());
cm.on("beforeSelectionChange", selectionChange);
}
});
function clearActiveLines(cm) {
for (var i = 0; i < cm.state.activeLines.length; i++) {
cm.removeLineClass(cm.state.activeLines[i], "wrap", WRAP_CLASS);
cm.removeLineClass(cm.state.activeLines[i], "background", BACK_CLASS);
cm.removeLineClass(cm.state.activeLines[i], "gutter", GUTT_CLASS);
}
}
function sameArray(a, b) {
if (a.length != b.length) return false;
for (var i = 0; i < a.length; i++)
if (a[i] != b[i]) return false;
return true;
}
function updateActiveLines(cm, ranges) {
var active = [];
for (var i = 0; i < ranges.length; i++) {
var range = ranges[i];
var option = cm.getOption("styleActiveLine");
if (typeof option == "object" && option.nonEmpty ? range.anchor.line != range.head.line : !range.empty())
continue
var line = cm.getLineHandleVisualStart(range.head.line);
if (active[active.length - 1] != line) active.push(line);
}
if (sameArray(cm.state.activeLines, active)) return;
cm.operation(function() {
clearActiveLines(cm);
for (var i = 0; i < active.length; i++) {
cm.addLineClass(active[i], "wrap", WRAP_CLASS);
cm.addLineClass(active[i], "background", BACK_CLASS);
cm.addLineClass(active[i], "gutter", GUTT_CLASS);
}
cm.state.activeLines = active;
});
}
function selectionChange(cm, sel) {
updateActiveLines(cm, sel.ranges);
}
});

File diff suppressed because one or more lines are too long

View File

@ -1,47 +0,0 @@
// CodeMirror, copyright (c) by Marijn Haverbeke and others
// Distributed under an MIT license: https://codemirror.net/LICENSE
(function(mod) {
if (typeof exports == "object" && typeof module == "object") // CommonJS
mod(require("../../lib/codemirror"))
else if (typeof define == "function" && define.amd) // AMD
define(["../../lib/codemirror"], mod)
else // Plain browser env
mod(CodeMirror)
})(function(CodeMirror) {
"use strict"
CodeMirror.defineOption("autoRefresh", false, function(cm, val) {
if (cm.state.autoRefresh) {
stopListening(cm, cm.state.autoRefresh)
cm.state.autoRefresh = null
}
if (val && cm.display.wrapper.offsetHeight == 0)
startListening(cm, cm.state.autoRefresh = {delay: val.delay || 250})
})
function startListening(cm, state) {
function check() {
if (cm.display.wrapper.offsetHeight) {
stopListening(cm, state)
if (cm.display.lastWrapHeight != cm.display.wrapper.clientHeight)
cm.refresh()
} else {
state.timeout = setTimeout(check, state.delay)
}
}
state.timeout = setTimeout(check, state.delay)
state.hurry = function() {
clearTimeout(state.timeout)
state.timeout = setTimeout(check, 50)
}
CodeMirror.on(window, "mouseup", state.hurry)
CodeMirror.on(window, "keyup", state.hurry)
}
function stopListening(_cm, state) {
clearTimeout(state.timeout)
CodeMirror.off(window, "mouseup", state.hurry)
CodeMirror.off(window, "keyup", state.hurry)
}
});

View File

@ -1,54 +0,0 @@
// CodeMirror, copyright (c) by Marijn Haverbeke and others
// Distributed under an MIT license: https://codemirror.net/LICENSE
// Modified by Weimin Wang
// jQuery required
(function(mod) {
// Plain browser env
mod(CodeMirror, "plain");
})(function(CodeMirror, env) {
if (!CodeMirror.modeURL) CodeMirror.modeURL = "../mode/%N/%N.js";
var loading = {};
function splitCallback(cont, n) {
var countDown = n;
return function() { if (--countDown == 0) cont(); };
}
function ensureDeps(mode, cont) {
var deps = CodeMirror.modes[mode].dependencies;
if (!deps) return cont();
var missing = [];
for (var i = 0; i < deps.length; ++i) {
if (!CodeMirror.modes.hasOwnProperty(deps[i]))
missing.push(deps[i]);
}
if (!missing.length) return cont();
var split = splitCallback(cont, missing.length);
for (var i = 0; i < missing.length; ++i)
CodeMirror.requireMode(missing[i], split);
}
CodeMirror.requireMode = function(mode, cont) {
if (typeof mode != "string") mode = mode.name;
if (CodeMirror.modes.hasOwnProperty(mode)) return ensureDeps(mode, cont);
if (loading.hasOwnProperty(mode)) return loading[mode].push(cont);
var file = CodeMirror.modeURL.replace(/%N/g, mode);
$.get(file, function (data){
let mode_func = new Function('CodeMirror', 'exports', 'define', data);
mode_func(CodeMirror, null, null);
var list = loading[mode] = [cont];
ensureDeps(mode, function() {
for (var i = 0; i < list.length; ++i) list[i]();
});
})
};
CodeMirror.autoLoadMode = function(instance, mode) {
if (!CodeMirror.modes.hasOwnProperty(mode))
CodeMirror.requireMode(mode, function() {
instance.setOption("mode", instance.getOption("mode"));
});
};
});

View File

@ -1,150 +0,0 @@
// CodeMirror, copyright (c) by Marijn Haverbeke and others
// Distributed under an MIT license: https://codemirror.net/LICENSE
(function(mod) {
if (typeof exports == "object" && typeof module == "object") // CommonJS
mod(require("../../lib/codemirror"));
else if (typeof define == "function" && define.amd) // AMD
define(["../../lib/codemirror"], mod);
else // Plain browser env
mod(CodeMirror);
})(function(CodeMirror) {
var ie_lt8 = /MSIE \d/.test(navigator.userAgent) &&
(document.documentMode == null || document.documentMode < 8);
var Pos = CodeMirror.Pos;
var matching = {"(": ")>", ")": "(<", "[": "]>", "]": "[<", "{": "}>", "}": "{<", "<": ">>", ">": "<<"};
function bracketRegex(config) {
return config && config.bracketRegex || /[(){}[\]]/
}
function findMatchingBracket(cm, where, config) {
var line = cm.getLineHandle(where.line), pos = where.ch - 1;
var afterCursor = config && config.afterCursor
if (afterCursor == null)
afterCursor = /(^| )cm-fat-cursor($| )/.test(cm.getWrapperElement().className)
var re = bracketRegex(config)
// A cursor is defined as between two characters, but in vim command mode
// (i.e. not insert mode), the cursor is visually represented as a
// highlighted box on top of the 2nd character. Otherwise, we allow matches
// from before or after the cursor.
var match = (!afterCursor && pos >= 0 && re.test(line.text.charAt(pos)) && matching[line.text.charAt(pos)]) ||
re.test(line.text.charAt(pos + 1)) && matching[line.text.charAt(++pos)];
if (!match) return null;
var dir = match.charAt(1) == ">" ? 1 : -1;
if (config && config.strict && (dir > 0) != (pos == where.ch)) return null;
var style = cm.getTokenTypeAt(Pos(where.line, pos + 1));
var found = scanForBracket(cm, Pos(where.line, pos + (dir > 0 ? 1 : 0)), dir, style || null, config);
if (found == null) return null;
return {from: Pos(where.line, pos), to: found && found.pos,
match: found && found.ch == match.charAt(0), forward: dir > 0};
}
// bracketRegex is used to specify which type of bracket to scan
// should be a regexp, e.g. /[[\]]/
//
// Note: If "where" is on an open bracket, then this bracket is ignored.
//
// Returns false when no bracket was found, null when it reached
// maxScanLines and gave up
function scanForBracket(cm, where, dir, style, config) {
var maxScanLen = (config && config.maxScanLineLength) || 10000;
var maxScanLines = (config && config.maxScanLines) || 1000;
var stack = [];
var re = bracketRegex(config)
var lineEnd = dir > 0 ? Math.min(where.line + maxScanLines, cm.lastLine() + 1)
: Math.max(cm.firstLine() - 1, where.line - maxScanLines);
for (var lineNo = where.line; lineNo != lineEnd; lineNo += dir) {
var line = cm.getLine(lineNo);
if (!line) continue;
var pos = dir > 0 ? 0 : line.length - 1, end = dir > 0 ? line.length : -1;
if (line.length > maxScanLen) continue;
if (lineNo == where.line) pos = where.ch - (dir < 0 ? 1 : 0);
for (; pos != end; pos += dir) {
var ch = line.charAt(pos);
if (re.test(ch) && (style === undefined || cm.getTokenTypeAt(Pos(lineNo, pos + 1)) == style)) {
var match = matching[ch];
if (match && (match.charAt(1) == ">") == (dir > 0)) stack.push(ch);
else if (!stack.length) return {pos: Pos(lineNo, pos), ch: ch};
else stack.pop();
}
}
}
return lineNo - dir == (dir > 0 ? cm.lastLine() : cm.firstLine()) ? false : null;
}
function matchBrackets(cm, autoclear, config) {
// Disable brace matching in long lines, since it'll cause hugely slow updates
var maxHighlightLen = cm.state.matchBrackets.maxHighlightLineLength || 1000;
var marks = [], ranges = cm.listSelections();
for (var i = 0; i < ranges.length; i++) {
var match = ranges[i].empty() && findMatchingBracket(cm, ranges[i].head, config);
if (match && cm.getLine(match.from.line).length <= maxHighlightLen) {
var style = match.match ? "CodeMirror-matchingbracket" : "CodeMirror-nonmatchingbracket";
marks.push(cm.markText(match.from, Pos(match.from.line, match.from.ch + 1), {className: style}));
if (match.to && cm.getLine(match.to.line).length <= maxHighlightLen)
marks.push(cm.markText(match.to, Pos(match.to.line, match.to.ch + 1), {className: style}));
}
}
if (marks.length) {
// Kludge to work around the IE bug from issue #1193, where text
// input stops going to the textarea whenever this fires.
if (ie_lt8 && cm.state.focused) cm.focus();
var clear = function() {
cm.operation(function() {
for (var i = 0; i < marks.length; i++) marks[i].clear();
});
};
if (autoclear) setTimeout(clear, 800);
else return clear;
}
}
function doMatchBrackets(cm) {
cm.operation(function() {
if (cm.state.matchBrackets.currentlyHighlighted) {
cm.state.matchBrackets.currentlyHighlighted();
cm.state.matchBrackets.currentlyHighlighted = null;
}
cm.state.matchBrackets.currentlyHighlighted = matchBrackets(cm, false, cm.state.matchBrackets);
});
}
CodeMirror.defineOption("matchBrackets", false, function(cm, val, old) {
if (old && old != CodeMirror.Init) {
cm.off("cursorActivity", doMatchBrackets);
if (cm.state.matchBrackets && cm.state.matchBrackets.currentlyHighlighted) {
cm.state.matchBrackets.currentlyHighlighted();
cm.state.matchBrackets.currentlyHighlighted = null;
}
}
if (val) {
cm.state.matchBrackets = typeof val == "object" ? val : {};
cm.on("cursorActivity", doMatchBrackets);
}
});
CodeMirror.defineExtension("matchBrackets", function() {matchBrackets(this, true);});
CodeMirror.defineExtension("findMatchingBracket", function(pos, config, oldConfig){
// Backwards-compatibility kludge
if (oldConfig || typeof config == "boolean") {
if (!oldConfig) {
config = config ? {strict: true} : null
} else {
oldConfig.strict = config
config = oldConfig
}
}
return findMatchingBracket(this, pos, config)
});
CodeMirror.defineExtension("scanForBracket", function(pos, dir, style, config){
return scanForBracket(this, pos, dir, style, config);
});
});

View File

@ -1,399 +0,0 @@
// CodeMirror, copyright (c) by Marijn Haverbeke and others
// Distributed under an MIT license: https://codemirror.net/LICENSE
(function(mod) {
if (typeof exports == "object" && typeof module == "object") // CommonJS
mod(require("../../lib/codemirror"));
else if (typeof define == "function" && define.amd) // AMD
define(["../../lib/codemirror"], mod);
else // Plain browser env
mod(CodeMirror);
})(function(CodeMirror) {
"use strict";
function wordRegexp(words) {
return new RegExp("^((" + words.join(")|(") + "))\\b");
}
var wordOperators = wordRegexp(["and", "or", "not", "is"]);
var commonKeywords = ["as", "assert", "break", "class", "continue",
"def", "del", "elif", "else", "except", "finally",
"for", "from", "global", "if", "import",
"lambda", "pass", "raise", "return",
"try", "while", "with", "yield", "in"];
var commonBuiltins = ["abs", "all", "any", "bin", "bool", "bytearray", "callable", "chr",
"classmethod", "compile", "complex", "delattr", "dict", "dir", "divmod",
"enumerate", "eval", "filter", "float", "format", "frozenset",
"getattr", "globals", "hasattr", "hash", "help", "hex", "id",
"input", "int", "isinstance", "issubclass", "iter", "len",
"list", "locals", "map", "max", "memoryview", "min", "next",
"object", "oct", "open", "ord", "pow", "property", "range",
"repr", "reversed", "round", "set", "setattr", "slice",
"sorted", "staticmethod", "str", "sum", "super", "tuple",
"type", "vars", "zip", "__import__", "NotImplemented",
"Ellipsis", "__debug__"];
CodeMirror.registerHelper("hintWords", "python", commonKeywords.concat(commonBuiltins));
function top(state) {
return state.scopes[state.scopes.length - 1];
}
CodeMirror.defineMode("python", function(conf, parserConf) {
var ERRORCLASS = "error";
var delimiters = parserConf.delimiters || parserConf.singleDelimiters || /^[\(\)\[\]\{\}@,:`=;\.\\]/;
// (Backwards-compatibility with old, cumbersome config system)
var operators = [parserConf.singleOperators, parserConf.doubleOperators, parserConf.doubleDelimiters, parserConf.tripleDelimiters,
parserConf.operators || /^([-+*/%\/&|^]=?|[<>=]+|\/\/=?|\*\*=?|!=|[~!@]|\.\.\.)/]
for (var i = 0; i < operators.length; i++) if (!operators[i]) operators.splice(i--, 1)
var hangingIndent = parserConf.hangingIndent || conf.indentUnit;
var myKeywords = commonKeywords, myBuiltins = commonBuiltins;
if (parserConf.extra_keywords != undefined)
myKeywords = myKeywords.concat(parserConf.extra_keywords);
if (parserConf.extra_builtins != undefined)
myBuiltins = myBuiltins.concat(parserConf.extra_builtins);
var py3 = !(parserConf.version && Number(parserConf.version) < 3)
if (py3) {
// since http://legacy.python.org/dev/peps/pep-0465/ @ is also an operator
var identifiers = parserConf.identifiers|| /^[_A-Za-z\u00A1-\uFFFF][_A-Za-z0-9\u00A1-\uFFFF]*/;
myKeywords = myKeywords.concat(["nonlocal", "False", "True", "None", "async", "await"]);
myBuiltins = myBuiltins.concat(["ascii", "bytes", "exec", "print"]);
var stringPrefixes = new RegExp("^(([rbuf]|(br)|(fr))?('{3}|\"{3}|['\"]))", "i");
} else {
var identifiers = parserConf.identifiers|| /^[_A-Za-z][_A-Za-z0-9]*/;
myKeywords = myKeywords.concat(["exec", "print"]);
myBuiltins = myBuiltins.concat(["apply", "basestring", "buffer", "cmp", "coerce", "execfile",
"file", "intern", "long", "raw_input", "reduce", "reload",
"unichr", "unicode", "xrange", "False", "True", "None"]);
var stringPrefixes = new RegExp("^(([rubf]|(ur)|(br))?('{3}|\"{3}|['\"]))", "i");
}
var keywords = wordRegexp(myKeywords);
var builtins = wordRegexp(myBuiltins);
// tokenizers
function tokenBase(stream, state) {
var sol = stream.sol() && state.lastToken != "\\"
if (sol) state.indent = stream.indentation()
// Handle scope changes
if (sol && top(state).type == "py") {
var scopeOffset = top(state).offset;
if (stream.eatSpace()) {
var lineOffset = stream.indentation();
if (lineOffset > scopeOffset)
pushPyScope(state);
else if (lineOffset < scopeOffset && dedent(stream, state) && stream.peek() != "#")
state.errorToken = true;
return null;
} else {
var style = tokenBaseInner(stream, state);
if (scopeOffset > 0 && dedent(stream, state))
style += " " + ERRORCLASS;
return style;
}
}
return tokenBaseInner(stream, state);
}
function tokenBaseInner(stream, state) {
if (stream.eatSpace()) return null;
// Handle Comments
if (stream.match(/^#.*/)) return "comment";
// Handle Number Literals
if (stream.match(/^[0-9\.]/, false)) {
var floatLiteral = false;
// Floats
if (stream.match(/^[\d_]*\.\d+(e[\+\-]?\d+)?/i)) { floatLiteral = true; }
if (stream.match(/^[\d_]+\.\d*/)) { floatLiteral = true; }
if (stream.match(/^\.\d+/)) { floatLiteral = true; }
if (floatLiteral) {
// Float literals may be "imaginary"
stream.eat(/J/i);
return "number";
}
// Integers
var intLiteral = false;
// Hex
if (stream.match(/^0x[0-9a-f_]+/i)) intLiteral = true;
// Binary
if (stream.match(/^0b[01_]+/i)) intLiteral = true;
// Octal
if (stream.match(/^0o[0-7_]+/i)) intLiteral = true;
// Decimal
if (stream.match(/^[1-9][\d_]*(e[\+\-]?[\d_]+)?/)) {
// Decimal literals may be "imaginary"
stream.eat(/J/i);
// TODO - Can you have imaginary longs?
intLiteral = true;
}
// Zero by itself with no other piece of number.
if (stream.match(/^0(?![\dx])/i)) intLiteral = true;
if (intLiteral) {
// Integer literals may be "long"
stream.eat(/L/i);
return "number";
}
}
// Handle Strings
if (stream.match(stringPrefixes)) {
var isFmtString = stream.current().toLowerCase().indexOf('f') !== -1;
if (!isFmtString) {
state.tokenize = tokenStringFactory(stream.current(), state.tokenize);
return state.tokenize(stream, state);
} else {
state.tokenize = formatStringFactory(stream.current(), state.tokenize);
return state.tokenize(stream, state);
}
}
for (var i = 0; i < operators.length; i++)
if (stream.match(operators[i])) return "operator"
if (stream.match(delimiters)) return "punctuation";
if (state.lastToken == "." && stream.match(identifiers))
return "property";
if (stream.match(keywords) || stream.match(wordOperators))
return "keyword";
if (stream.match(builtins))
return "builtin";
if (stream.match(/^(self|cls)\b/))
return "variable-2";
if (stream.match(identifiers)) {
if (state.lastToken == "def" || state.lastToken == "class")
return "def";
return "variable";
}
// Handle non-detected items
stream.next();
return ERRORCLASS;
}
function formatStringFactory(delimiter, tokenOuter) {
while ("rubf".indexOf(delimiter.charAt(0).toLowerCase()) >= 0)
delimiter = delimiter.substr(1);
var singleline = delimiter.length == 1;
var OUTCLASS = "string";
function tokenNestedExpr(depth) {
return function(stream, state) {
var inner = tokenBaseInner(stream, state)
if (inner == "punctuation") {
if (stream.current() == "{") {
state.tokenize = tokenNestedExpr(depth + 1)
} else if (stream.current() == "}") {
if (depth > 1) state.tokenize = tokenNestedExpr(depth - 1)
else state.tokenize = tokenString
}
}
return inner
}
}
function tokenString(stream, state) {
while (!stream.eol()) {
stream.eatWhile(/[^'"\{\}\\]/);
if (stream.eat("\\")) {
stream.next();
if (singleline && stream.eol())
return OUTCLASS;
} else if (stream.match(delimiter)) {
state.tokenize = tokenOuter;
return OUTCLASS;
} else if (stream.match('{{')) {
// ignore {{ in f-str
return OUTCLASS;
} else if (stream.match('{', false)) {
// switch to nested mode
state.tokenize = tokenNestedExpr(0)
if (stream.current()) return OUTCLASS;
else return state.tokenize(stream, state)
} else if (stream.match('}}')) {
return OUTCLASS;
} else if (stream.match('}')) {
// single } in f-string is an error
return ERRORCLASS;
} else {
stream.eat(/['"]/);
}
}
if (singleline) {
if (parserConf.singleLineStringErrors)
return ERRORCLASS;
else
state.tokenize = tokenOuter;
}
return OUTCLASS;
}
tokenString.isString = true;
return tokenString;
}
function tokenStringFactory(delimiter, tokenOuter) {
while ("rubf".indexOf(delimiter.charAt(0).toLowerCase()) >= 0)
delimiter = delimiter.substr(1);
var singleline = delimiter.length == 1;
var OUTCLASS = "string";
function tokenString(stream, state) {
while (!stream.eol()) {
stream.eatWhile(/[^'"\\]/);
if (stream.eat("\\")) {
stream.next();
if (singleline && stream.eol())
return OUTCLASS;
} else if (stream.match(delimiter)) {
state.tokenize = tokenOuter;
return OUTCLASS;
} else {
stream.eat(/['"]/);
}
}
if (singleline) {
if (parserConf.singleLineStringErrors)
return ERRORCLASS;
else
state.tokenize = tokenOuter;
}
return OUTCLASS;
}
tokenString.isString = true;
return tokenString;
}
function pushPyScope(state) {
while (top(state).type != "py") state.scopes.pop()
state.scopes.push({offset: top(state).offset + conf.indentUnit,
type: "py",
align: null})
}
function pushBracketScope(stream, state, type) {
var align = stream.match(/^([\s\[\{\(]|#.*)*$/, false) ? null : stream.column() + 1
state.scopes.push({offset: state.indent + hangingIndent,
type: type,
align: align})
}
function dedent(stream, state) {
var indented = stream.indentation();
while (state.scopes.length > 1 && top(state).offset > indented) {
if (top(state).type != "py") return true;
state.scopes.pop();
}
return top(state).offset != indented;
}
function tokenLexer(stream, state) {
if (stream.sol()) state.beginningOfLine = true;
var style = state.tokenize(stream, state);
var current = stream.current();
// Handle decorators
if (state.beginningOfLine && current == "@")
return stream.match(identifiers, false) ? "meta" : py3 ? "operator" : ERRORCLASS;
if (/\S/.test(current)) state.beginningOfLine = false;
if ((style == "variable" || style == "builtin")
&& state.lastToken == "meta")
style = "meta";
// Handle scope changes.
if (current == "pass" || current == "return")
state.dedent += 1;
if (current == "lambda") state.lambda = true;
if (current == ":" && !state.lambda && top(state).type == "py")
pushPyScope(state);
if (current.length == 1 && !/string|comment/.test(style)) {
var delimiter_index = "[({".indexOf(current);
if (delimiter_index != -1)
pushBracketScope(stream, state, "])}".slice(delimiter_index, delimiter_index+1));
delimiter_index = "])}".indexOf(current);
if (delimiter_index != -1) {
if (top(state).type == current) state.indent = state.scopes.pop().offset - hangingIndent
else return ERRORCLASS;
}
}
if (state.dedent > 0 && stream.eol() && top(state).type == "py") {
if (state.scopes.length > 1) state.scopes.pop();
state.dedent -= 1;
}
return style;
}
var external = {
startState: function(basecolumn) {
return {
tokenize: tokenBase,
scopes: [{offset: basecolumn || 0, type: "py", align: null}],
indent: basecolumn || 0,
lastToken: null,
lambda: false,
dedent: 0
};
},
token: function(stream, state) {
var addErr = state.errorToken;
if (addErr) state.errorToken = false;
var style = tokenLexer(stream, state);
if (style && style != "comment")
state.lastToken = (style == "keyword" || style == "punctuation") ? stream.current() : style;
if (style == "punctuation") style = null;
if (stream.eol() && state.lambda)
state.lambda = false;
return addErr ? style + " " + ERRORCLASS : style;
},
indent: function(state, textAfter) {
if (state.tokenize != tokenBase)
return state.tokenize.isString ? CodeMirror.Pass : 0;
var scope = top(state), closing = scope.type == textAfter.charAt(0)
if (scope.align != null)
return scope.align - (closing ? 1 : 0)
else
return scope.offset - (closing ? hangingIndent : 0)
},
electricInput: /^\s*[\}\]\)]$/,
closeBrackets: {triples: "'\""},
lineComment: "#",
fold: "indent"
};
return external;
});
CodeMirror.defineMIME("text/x-python", "python");
var words = function(str) { return str.split(" "); };
CodeMirror.defineMIME("text/x-cython", {
name: "python",
extra_keywords: words("by cdef cimport cpdef ctypedef enum except "+
"extern gil include nogil property public "+
"readonly struct union DEF IF ELIF ELSE")
});
});

View File

@ -1,365 +0,0 @@
.pywebio {
min-height: 100vh;
padding-top: 20px;
padding-bottom: 1px; /* if set 0, safari has min-height issue */
}
.markdown-body hr {
height: 2px;
padding: 0;
margin: 24px 0;
background-color: #eaecef;
border: 0;
}
.container {
margin-top: 0;
max-width: 880px;
}
#input-cards{
max-width: 880px;
}
#output-container {
-webkit-font-smoothing: antialiased;
margin-bottom: 20px;
}
#input-container h5.card-header {
padding: .4rem 1.25rem;
}
#input-container {
z-index: 100;
background: white;
position: static;
height: fit-content;
box-shadow: none;
margin-top: 0;
margin-bottom: 40px; /* must equal #input-container.fixed padding-bottom */
}
#input-container.fixed {
position: fixed !important;
overflow: visible;
bottom: 0;
left: 0;
right: 0;
height: 0;
box-shadow: 0 0 12px 1px rgba(80, 80, 80, 0.2);
margin-bottom: 0;
padding: 40px 0; /* must equal #input-container margin-bottom */
}
#input-container .card {
margin: 0 auto;
}
.footer {
height: 50px;
text-align: center;
color: gray;
background-color: #efefef;
line-height: 50px; /*设置line-height与父级元素的height相等, 以实现垂直居中*/
margin: 0 auto;
}
.footer hr {
margin-bottom: 0.5rem;
}
.footer a:visited {
color: #9B59B6;
}
.footer a {
color: #2980B9;
text-decoration: none;
cursor: pointer;
}
#title {
text-align: center;
position: absolute;
left: 50%;
transform: translateX(-50%);
white-space: nowrap;
}
/* Amend markdown style */
.markdown-body {
font-family: inherit;
color: inherit;
}
.markdown-body blockquote, .markdown-body dl, .markdown-body ol, .markdown-body p, .markdown-body pre, .markdown-body table, .markdown-body ul, .markdown-body details {
margin-bottom: 10px;
}
.CodeMirror {
font-size: 13px
}
h5.card-header:empty {
padding: 0 !important;
border-bottom: 0 !important;
}
button {
margin-bottom: 8px;
}
td blockquote, td dl, td ol, td p, td pre, td table, td ul, td button, td pre {
margin-bottom: 0 !important;
}
.input-container .form-group {
margin-bottom: 0;
}
img {
-webkit-animation-name: image-load-in;
animation-name: image-load-in;
-webkit-animation-duration: .6s;
animation-duration: .6s
}
@-webkit-keyframes image-load-in {
0% {
-webkit-filter: blur(8px);
filter: blur(8px);
opacity: 0
}
100% {
-webkit-filter: blur(0);
filter: blur(0);
opacity: 1
}
}
@keyframes image-load-in {
0% {
-webkit-filter: blur(8px);
filter: blur(8px);
opacity: 0
}
100% {
-webkit-filter: blur(0);
filter: blur(0);
opacity: 1
}
}
.custom-file {
margin-bottom: 8px;
}
summary:focus {
outline: none;
}
details {
border: 1px solid rgba(0,0,0,.125);
border-radius: 4px;
padding: .5em .5em 0;
}
summary {
/*font-weight: bold;*/
margin: -.5em -.5em 0;
padding: .5em;
background-color: rgba(0,0,0,.03);
}
details[open] {
padding: .5em;
}
details[open]>summary {
border-bottom: 1px solid rgba(0,0,0,.125);
margin-bottom: .5em;
}
.actions-result {
display: inline-block;
font-weight: 400;
color: #212529;
text-align: center;
vertical-align: middle;
padding: .375rem .75rem;
font-size: 1rem;
line-height: 1.5;
margin-bottom: 8px;
}
.hide{
display: none;
}
.single-input-action-btn{
border-top-right-radius: 0.25rem!important;
border-bottom-right-radius: 0.25rem!important;
}
.alert-success > a {
color: #0b2e13;
}
.alert-danger > a {
color: #491217;
}
.alert-warning > a {
color: #533f03;
}
.alert-info > a {
color: #062c33;
}
.alert > a {
font-weight: 700;
}
/*Tabs widget style*/
/*Credit: https://themes.gohugo.io/theme/hugo-book/docs/shortcodes/tabs/ Licensed by MIT */
.webio-tabs {
margin-top: 1rem;
margin-bottom: 1rem;
border: 1px solid #e9ecef;
border-radius: .25rem;
overflow: hidden;
display: flex;
flex-wrap: wrap;
align-content: flex-start;
}
.webio-tabs > label {
display: inline-block;
padding: .5rem 1rem;
border-bottom: 1px transparent;
cursor: pointer;
margin-bottom: 0!important;
}
.webio-tabs > label:hover {
background-color: #e9ecef;
}
.webio-tabs > .webio-tabs-content {
order: 999;
width: 100%;
border-top: 1px solid #f5f5f5;
padding: 1rem;
display: none
}
.webio-tabs > input[type=radio]:checked + label {
border-bottom: 2px solid #0055bb
}
.webio-tabs > input[type=radio]:checked + label + .webio-tabs-content {
display: block
}
.webio-tabs > input.toggle {
height: 0;
width: 0;
overflow: hidden;
opacity: 0;
position: absolute;
}
.webio-tabs > [type=radio] {
box-sizing: border-box;
padding: 0;
}
/*End of Tabs widget*/
.custom-file-label > span {
overflow: hidden;
width: calc(100% - 6ch - 0.75rem) !important; /* 6ch is for "Browse", 0.75rem is for the padding of "Browse" button */
text-overflow: ellipsis;
white-space: nowrap;
display: inline-block !important;
}
.CodeMirror-fullscreen {
position: fixed;
top: 0; left: 0; right: 0; bottom: 0;
height: auto;
z-index: 9;
}
.pywebio-clickable{
cursor: pointer;
}
/* scrollable widget */
.webio-scrollable{
overflow: auto;
margin-bottom: 10px;
}
.webio-scrollable.scrollable-border{
padding: 10px;
border: 1px solid rgba(0,0,0,.125);
box-shadow: inset 0 0 2px 0 rgba(0,0,0,.1);
}
/* dark theme */
.webio-theme-dark #input-container {
background: #0d1117;
}
.webio-theme-dark #input-container.fixed {
box-shadow: 0px 0px 8px 5px rgb(0 0 0);
}
.webio-theme-dark .footer {
background-color: #0d1117;
border-top: 1px solid #30363d;
}
.webio-theme-dark .webio-tabs {
border: 1px solid #343a40;
}
.webio-theme-dark .webio-tabs > .webio-tabs-content {
border-top: 1px solid #343a40;
}
.webio-theme-dark .webio-tabs > label:hover {
background-color: #000;
}
.webio-theme-dark .webio-tabs > input[type=radio]:checked + label {
border-bottom: 2px solid #0040a1;
}
.webio-theme-dark .scrollable-border{
border: 1px solid #343a40;
}
.webio-theme-dark details{
border: 1px solid #343a40;
}
.webio-theme-dark details>summary{
background-color: #191d21;
}
.webio-theme-dark details[open]>summary{
border-bottom: 1px solid #343a40;
}
/* dark theme end */
/* For range input */
.form-control-range {
display:inline;
width:calc(100% - 40px);
}
.form-control-range-value{
display: inline-block;
max-width: 35px;
white-space: nowrap;
color: #6c757d;
line-height: 14px;
vertical-align: text-top;
}

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@ -1,15 +0,0 @@
/**
* Minified by jsDelivr using clean-css v4.2.3.
* Original file: /npm/toastify-js@1.9.3/src/toastify.css
*
* Do NOT use SRI with dynamically generated files! More information: https://www.jsdelivr.com/using-sri-with-dynamic-files
*/
/*!
* Toastify js 1.9.3
* https://github.com/apvarun/toastify-js
* @license MIT licensed
*
* Copyright (C) 2018 Varun A P
*/
.toastify{padding:12px 20px;color:#fff;display:inline-block;box-shadow:0 3px 6px -1px rgba(0,0,0,.12),0 10px 36px -4px rgba(77,96,232,.3);background:-webkit-linear-gradient(315deg,#73a5ff,#5477f5);background:linear-gradient(135deg,#73a5ff,#5477f5);position:fixed;opacity:0;transition:all .4s cubic-bezier(.215,.61,.355,1);border-radius:2px;cursor:pointer;text-decoration:none;max-width:calc(50% - 20px);z-index:2147483647}.toastify.on{opacity:1}.toast-close{opacity:.4;padding:0 5px}.toastify-right{right:15px}.toastify-left{left:15px}.toastify-top{top:-150px}.toastify-bottom{bottom:-150px}.toastify-rounded{border-radius:25px}.toastify-avatar{width:1.5em;height:1.5em;margin:-7px 5px;border-radius:2px}.toastify-center{margin-left:auto;margin-right:auto;left:0;right:0;max-width:fit-content;max-width:-moz-fit-content}@media only screen and (max-width:360px){.toastify-left,.toastify-right{margin-left:auto;margin-right:auto;left:0;right:0;max-width:fit-content}}
/*# sourceMappingURL=/sm/40f738e33ed5dbe7907b48c3be4b63e977eab6cb49c8df4f76f3edc3f1f2fb0d.map */

Binary file not shown.

Before: 513 B

Binary file not shown.

Before: 251 B

View File

@ -1,3 +0,0 @@
(function(a,b){if("function"==typeof define&&define.amd)define([],b);else if("undefined"!=typeof exports)b();else{b(),a.FileSaver={exports:{}}.exports}})(this,function(){"use strict";function b(a,b){return"undefined"==typeof b?b={autoBom:!1}:"object"!=typeof b&&(console.warn("Deprecated: Expected third argument to be a object"),b={autoBom:!b}),b.autoBom&&/^\s*(?:text\/\S*|application\/xml|\S*\/\S*\+xml)\s*;.*charset\s*=\s*utf-8/i.test(a.type)?new Blob(["\uFEFF",a],{type:a.type}):a}function c(b,c,d){var e=new XMLHttpRequest;e.open("GET",b),e.responseType="blob",e.onload=function(){a(e.response,c,d)},e.onerror=function(){console.error("could not download file")},e.send()}function d(a){var b=new XMLHttpRequest;b.open("HEAD",a,!1);try{b.send()}catch(a){}return 200<=b.status&&299>=b.status}function e(a){try{a.dispatchEvent(new MouseEvent("click"))}catch(c){var b=document.createEvent("MouseEvents");b.initMouseEvent("click",!0,!0,window,0,0,0,80,20,!1,!1,!1,!1,0,null),a.dispatchEvent(b)}}var f="object"==typeof window&&window.window===window?window:"object"==typeof self&&self.self===self?self:"object"==typeof global&&global.global===global?global:void 0,a=f.saveAs||("object"!=typeof window||window!==f?function(){}:"download"in HTMLAnchorElement.prototype?function(b,g,h){var i=f.URL||f.webkitURL,j=document.createElement("a");g=g||b.name||"download",j.download=g,j.rel="noopener","string"==typeof b?(j.href=b,j.origin===location.origin?e(j):d(j.href)?c(b,g,h):e(j,j.target="_blank")):(j.href=i.createObjectURL(b),setTimeout(function(){i.revokeObjectURL(j.href)},4E4),setTimeout(function(){e(j)},0))}:"msSaveOrOpenBlob"in navigator?function(f,g,h){if(g=g||f.name||"download","string"!=typeof f)navigator.msSaveOrOpenBlob(b(f,h),g);else if(d(f))c(f,g,h);else{var i=document.createElement("a");i.href=f,i.target="_blank",setTimeout(function(){e(i)})}}:function(a,b,d,e){if(e=e||open("","_blank"),e&&(e.document.title=e.document.body.innerText="downloading..."),"string"==typeof a)return c(a,b,d);var g="application/octet-stream"===a.type,h=/constructor/i.test(f.HTMLElement)||f.safari,i=/CriOS\/[\d]+/.test(navigator.userAgent);if((i||g&&h)&&"undefined"!=typeof FileReader){var j=new FileReader;j.onloadend=function(){var a=j.result;a=i?a:a.replace(/^data:[^;]*;/,"data:attachment/file;"),e?e.location.href=a:location=a,e=null},j.readAsDataURL(a)}else{var k=f.URL||f.webkitURL,l=k.createObjectURL(a);e?e.location=l:location.href=l,e=null,setTimeout(function(){k.revokeObjectURL(l)},4E4)}});f.saveAs=a.saveAs=a,"undefined"!=typeof module&&(module.exports=a)});
//# sourceMappingURL=FileSaver.min.js.map

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@ -1,7 +0,0 @@
/*!
* bsCustomFileInput v1.3.4 (https://github.com/Johann-S/bs-custom-file-input)
* Copyright 2018 - 2020 Johann-S <johann.servoire@gmail.com>
* Licensed under MIT (https://github.com/Johann-S/bs-custom-file-input/blob/master/LICENSE)
*/
!function(e,t){"object"==typeof exports&&"undefined"!=typeof module?module.exports=t():"function"==typeof define&&define.amd?define(t):(e=e||self).bsCustomFileInput=t()}(this,function(){"use strict";var s={CUSTOMFILE:'.custom-file input[type="file"]',CUSTOMFILELABEL:".custom-file-label",FORM:"form",INPUT:"input"},l=function(e){if(0<e.childNodes.length)for(var t=[].slice.call(e.childNodes),n=0;n<t.length;n++){var l=t[n];if(3!==l.nodeType)return l}return e},u=function(e){var t=e.bsCustomFileInput.defaultText,n=e.parentNode.querySelector(s.CUSTOMFILELABEL);n&&(l(n).textContent=t)},n=!!window.File,r=function(e){if(e.hasAttribute("multiple")&&n)return[].slice.call(e.files).map(function(e){return e.name}).join(", ");if(-1===e.value.indexOf("fakepath"))return e.value;var t=e.value.split("\\");return t[t.length-1]};function d(){var e=this.parentNode.querySelector(s.CUSTOMFILELABEL);if(e){var t=l(e),n=r(this);n.length?t.textContent=n:u(this)}}function v(){for(var e=[].slice.call(this.querySelectorAll(s.INPUT)).filter(function(e){return!!e.bsCustomFileInput}),t=0,n=e.length;t<n;t++)u(e[t])}var p="bsCustomFileInput",m="reset",h="change";return{init:function(e,t){void 0===e&&(e=s.CUSTOMFILE),void 0===t&&(t=s.FORM);for(var n,l,r=[].slice.call(document.querySelectorAll(e)),i=[].slice.call(document.querySelectorAll(t)),o=0,u=r.length;o<u;o++){var c=r[o];Object.defineProperty(c,p,{value:{defaultText:(n=void 0,n="",(l=c.parentNode.querySelector(s.CUSTOMFILELABEL))&&(n=l.textContent),n)},writable:!0}),d.call(c),c.addEventListener(h,d)}for(var f=0,a=i.length;f<a;f++)i[f].addEventListener(m,v),Object.defineProperty(i[f],p,{value:!0,writable:!0})},destroy:function(){for(var e=[].slice.call(document.querySelectorAll(s.FORM)).filter(function(e){return!!e.bsCustomFileInput}),t=[].slice.call(document.querySelectorAll(s.INPUT)).filter(function(e){return!!e.bsCustomFileInput}),n=0,l=t.length;n<l;n++){var r=t[n];u(r),r[p]=void 0,r.removeEventListener(h,d)}for(var i=0,o=e.length;i<o;i++)e[i].removeEventListener(m,v),e[i][p]=void 0}}});
//# sourceMappingURL=bs-custom-file-input.min.js.map

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@ -1,15 +0,0 @@
/**
* Minified by jsDelivr using Terser v5.3.0.
* Original file: /npm/toastify-js@1.9.3/src/toastify.js
*
* Do NOT use SRI with dynamically generated files! More information: https://www.jsdelivr.com/using-sri-with-dynamic-files
*/
/*!
* Toastify js 1.9.3
* https://github.com/apvarun/toastify-js
* @license MIT licensed
*
* Copyright (C) 2018 Varun A P
*/
!function(t,o){"object"==typeof module&&module.exports?module.exports=o():t.Toastify=o()}(this,(function(t){var o=function(t){return new o.lib.init(t)};function i(t,o){return o.offset[t]?isNaN(o.offset[t])?o.offset[t]:o.offset[t]+"px":"0px"}function s(t,o){return!(!t||"string"!=typeof o)&&!!(t.className&&t.className.trim().split(/\s+/gi).indexOf(o)>-1)}return o.lib=o.prototype={toastify:"1.9.3",constructor:o,init:function(t){return t||(t={}),this.options={},this.toastElement=null,this.options.text=t.text||"Hi there!",this.options.node=t.node,this.options.duration=0===t.duration?0:t.duration||3e3,this.options.selector=t.selector,this.options.callback=t.callback||function(){},this.options.destination=t.destination,this.options.newWindow=t.newWindow||!1,this.options.close=t.close||!1,this.options.gravity="bottom"===t.gravity?"toastify-bottom":"toastify-top",this.options.positionLeft=t.positionLeft||!1,this.options.position=t.position||"",this.options.backgroundColor=t.backgroundColor,this.options.avatar=t.avatar||"",this.options.className=t.className||"",this.options.stopOnFocus=void 0===t.stopOnFocus||t.stopOnFocus,this.options.onClick=t.onClick,this.options.offset=t.offset||{x:0,y:0},this},buildToast:function(){if(!this.options)throw"Toastify is not initialized";var t=document.createElement("div");if(t.className="toastify on "+this.options.className,this.options.position?t.className+=" toastify-"+this.options.position:!0===this.options.positionLeft?(t.className+=" toastify-left",console.warn("Property `positionLeft` will be depreciated in further versions. Please use `position` instead.")):t.className+=" toastify-right",t.className+=" "+this.options.gravity,this.options.backgroundColor&&(t.style.background=this.options.backgroundColor),this.options.node&&this.options.node.nodeType===Node.ELEMENT_NODE)t.appendChild(this.options.node);else if(t.innerHTML=this.options.text,""!==this.options.avatar){var o=document.createElement("img");o.src=this.options.avatar,o.className="toastify-avatar","left"==this.options.position||!0===this.options.positionLeft?t.appendChild(o):t.insertAdjacentElement("afterbegin",o)}if(!0===this.options.close){var s=document.createElement("span");s.innerHTML="&#10006;",s.className="toast-close",s.addEventListener("click",function(t){t.stopPropagation(),this.removeElement(this.toastElement),window.clearTimeout(this.toastElement.timeOutValue)}.bind(this));var n=window.innerWidth>0?window.innerWidth:screen.width;("left"==this.options.position||!0===this.options.positionLeft)&&n>360?t.insertAdjacentElement("afterbegin",s):t.appendChild(s)}if(this.options.stopOnFocus&&this.options.duration>0){var e=this;t.addEventListener("mouseover",(function(o){window.clearTimeout(t.timeOutValue)})),t.addEventListener("mouseleave",(function(){t.timeOutValue=window.setTimeout((function(){e.removeElement(t)}),e.options.duration)}))}if(void 0!==this.options.destination&&t.addEventListener("click",function(t){t.stopPropagation(),!0===this.options.newWindow?window.open(this.options.destination,"_blank"):window.location=this.options.destination}.bind(this)),"function"==typeof this.options.onClick&&void 0===this.options.destination&&t.addEventListener("click",function(t){t.stopPropagation(),this.options.onClick()}.bind(this)),"object"==typeof this.options.offset){var a=i("x",this.options),p=i("y",this.options),r="left"==this.options.position?a:"-"+a,l="toastify-top"==this.options.gravity?p:"-"+p;t.style.transform="translate("+r+","+l+")"}return t},showToast:function(){var 
t;if(this.toastElement=this.buildToast(),!(t=void 0===this.options.selector?document.body:document.getElementById(this.options.selector)))throw"Root element is not defined";return t.insertBefore(this.toastElement,t.firstChild),o.reposition(),this.options.duration>0&&(this.toastElement.timeOutValue=window.setTimeout(function(){this.removeElement(this.toastElement)}.bind(this),this.options.duration)),this},hideToast:function(){this.toastElement.timeOutValue&&clearTimeout(this.toastElement.timeOutValue),this.removeElement(this.toastElement)},removeElement:function(t){t.className=t.className.replace(" on",""),window.setTimeout(function(){this.options.node&&this.options.node.parentNode&&this.options.node.parentNode.removeChild(this.options.node),t.parentNode&&t.parentNode.removeChild(t),this.options.callback.call(t),o.reposition()}.bind(this),400)}},o.reposition=function(){for(var t,o={top:15,bottom:15},i={top:15,bottom:15},n={top:15,bottom:15},e=document.getElementsByClassName("toastify"),a=0;a<e.length;a++){t=!0===s(e[a],"toastify-top")?"toastify-top":"toastify-bottom";var p=e[a].offsetHeight;t=t.substr(9,t.length-1);(window.innerWidth>0?window.innerWidth:screen.width)<=360?(e[a].style[t]=n[t]+"px",n[t]+=p+15):!0===s(e[a],"toastify-left")?(e[a].style[t]=o[t]+"px",o[t]+=p+15):(e[a].style[t]=i[t]+"px",i[t]+=p+15)}return this},o.lib.init.prototype=o.lib,o}));
//# sourceMappingURL=/sm/b2692762f762f02a544ce708819ce22427514c155203d0627f14174806ee9f38.map

View File

@ -1,863 +0,0 @@
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
# @Author: https://github.com/Evil0ctal/
# @Time: 2021/11/06
# @Update: 2024/03/25
# @Version: 3.1.8
# @Function:
# 创建一个接受提交参数的FastAPi应用程序。
# 将scraper.py返回的内容以JSON格式返回。
import os
import time
import json
import aiohttp
import uvicorn
import zipfile
import threading
import yaml
from fastapi import FastAPI, Request
from fastapi.responses import ORJSONResponse, FileResponse
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address
from pydantic import BaseModel
from starlette.responses import RedirectResponse
from scraper import Scraper
# 读取配置文件
with open('config.yml', 'r', encoding='utf-8') as yaml_file:
config = yaml.safe_load(yaml_file)
# 运行端口
port = int(config["Web_API"]["Port"])
# 域名
domain = config["Web_API"]["Domain"]
# 限制器/Limiter
Rate_Limit = config["Web_API"]["Rate_Limit"]
# 创建FastAPI实例
title = "Douyin TikTok Download/Scraper API-V1"
version = '3.1.8'
update_time = "2023/09/25"
description = """
#### Description/说明
<details>
<summary>点击展开/Click to expand</summary>
> [中文/Chinese]
- 爬取Douyin以及TikTok的数据并返回,更多功能正在开发中
- 如果需要更多接口,请查看[https://api.tikhub.io/docs](https://api.tikhub.io/docs)
- 本项目开源在[GitHub:Douyin_TikTok_Download_API](https://github.com/Evil0ctal/Douyin_TikTok_Download_API)
- 全部端点数据均来自抖音以及TikTok的官方接口,如遇到问题或BUG或建议,请在[issues](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/issues)中反馈
- 本项目仅供学习交流使用,严禁用于违法用途,如有侵权请联系作者
> [英文/English]
- Crawl the data of Douyin and TikTok and return it. More features are under development.
- If you need more interfaces, please visit [https://api.tikhub.io/docs](https://api.tikhub.io/docs).
- This project is open source on [GitHub: Douyin_TikTok_Download_API](https://github.com/Evil0ctal/Douyin_TikTok_Download_API).
- All endpoint data comes from the official interface of Douyin and TikTok. If you have any questions, BUGs, or suggestions, please give feedback in [issues](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/issues).
- This project is for learning and communication only. It is strictly forbidden to be used for illegal purposes. If there is any infringement, please contact the author.
</details>
#### Contact author/联系作者
<details>
<summary>点击展开/Click to expand</summary>
- WeChat: Evil0ctal
- Email: [Evil0ctal1985@gmail.com](mailto:Evil0ctal1985@gmail.com)
- Github: [https://github.com/Evil0ctal](https://github.com/Evil0ctal)
</details>
"""
tags_metadata = [
{
"name": "Root",
"description": "Root path info.",
},
{
"name": "API",
"description": "Hybrid interface, automatically determine the input link and return the simplified data/混合接口,自动判断输入链接返回精简后的数据。",
},
{
"name": "Douyin",
"description": "All Douyin API Endpoints/所有抖音接口节点",
},
{
"name": "TikTok",
"description": "All TikTok API Endpoints/所有TikTok接口节点",
},
{
"name": "Download",
"description": "Enter the share link and return the download file response./输入分享链接后返回下载文件响应",
},
{
"name": "iOS_Shortcut",
"description": "Get iOS shortcut info/获取iOS快捷指令信息",
},
]
# 创建Scraper对象
api = Scraper()
# 创建FastAPI实例
app = FastAPI(
title=title,
description=description,
version=version,
openapi_tags=tags_metadata
)
# 创建Limiter对象
limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
""" ________________________⬇端点响应模型(Endpoints Response Model)⬇________________________"""
# API Root节点
class APIRoot(BaseModel):
API_status: str
Version: str = version
Update_time: str = update_time
Request_Rate_Limit: str = Rate_Limit
Web_APP: str
API_V1_Document: str
TikHub_API_Document: str
GitHub: str
# API获取视频基础模型
class iOS_Shortcut(BaseModel):
version: str = None
update: str = None
link: str = None
link_en: str = None
note: str = None
note_en: str = None
# API获取视频基础模型
class API_Video_Response(BaseModel):
status: str = None
platform: str = None
endpoint: str = None
message: str = None
total_time: float = None
aweme_list: list = None
# 混合解析API基础模型:
class API_Hybrid_Response(BaseModel):
status: str = None
message: str = None
endpoint: str = None
url: str = None
type: str = None
platform: str = None
aweme_id: str = None
total_time: float = None
official_api_url: dict = None
desc: str = None
create_time: int = None
author: dict = None
music: dict = None
statistics: dict = None
cover_data: dict = None
hashtags: list = None
video_data: dict = None
image_data: dict = None
# 混合解析API精简版基础模型:
class API_Hybrid_Minimal_Response(BaseModel):
status: str = None
message: str = None
platform: str = None
type: str = None
wm_video_url: str = None
wm_video_url_HQ: str = None
nwm_video_url: str = None
nwm_video_url_HQ: str = None
no_watermark_image_list: list or None = None
watermark_image_list: list or None = None
""" ________________________⬇端点日志记录(Endpoint logs)⬇________________________"""
# 记录API请求日志
async def api_logs(start_time, input_data, endpoint, error_data: dict = None):
if config["Web_API"]["Allow_Logs"]:
time_now = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
total_time = float(format(time.time() - start_time, '.4f'))
file_name = "API_logs.json"
# 写入日志内容
with open(file_name, "a", encoding="utf-8") as f:
data = {
"time": time_now,
"endpoint": f'/{endpoint}/',
"total_time": total_time,
"input_data": input_data,
"error_data": error_data if error_data else "No error"
}
f.write(json.dumps(data, ensure_ascii=False) + ",\n")
print('日志记录成功!')
return 1
else:
print('日志记录已关闭!')
return 0
""" ________________________⬇Root端点(Root endpoint)⬇________________________"""
# Root端点
@app.get("/", response_class=ORJSONResponse, response_model=APIRoot, tags=["Root"])
async def root():
"""
Root path info.
"""
data = {
"API_status": "Running",
"Version": version,
"Update_time": update_time,
"Request_Rate_Limit": Rate_Limit,
"Web_APP": "https://www.douyin.wtf/",
"API_V1_Document": "https://api.douyin.wtf/docs",
"TikHub_API_Document": "https://api.tikhub.io/docs",
"GitHub": "https://github.com/Evil0ctal/Douyin_TikTok_Download_API",
}
return ORJSONResponse(data)
""" ________________________⬇混合解析端点(Hybrid parsing endpoints)⬇________________________"""
# 混合解析端点,自动判断输入链接返回精简后的数据
# Hybrid parsing endpoint, automatically determine the input link and return the simplified data.
@app.get("/api", tags=["API"], response_class=ORJSONResponse, response_model=API_Hybrid_Response)
@limiter.limit(Rate_Limit)
async def hybrid_parsing(request: Request, url: str, minimal: bool = False):
"""
## 用途/Usage
- 获取[抖音|TikTok]单个视频数据参数是视频链接或分享口令
- Get [Douyin|TikTok] single video data, the parameter is the video link or share code.
## 参数/Parameter
#### url(必填/Required):
- 视频链接| 分享口令
- The video link.| Share code
- 例子/Example:
`https://www.douyin.com/video/7153585499477757192`
`https://v.douyin.com/MkmSwy7/`
`https://vm.tiktok.com/TTPdkQvKjP/`
`https://www.tiktok.com/@tvamii/video/7045537727743380782`
#### minimal(选填/Optional Default:False):
- 是否返回精简版数据
- Whether to return simplified data.
- 例子/Example:
`True`
`False`
## 返回值/Return
- 用户单个视频数据的列表,列表内包含JSON数据
- A list containing the single video's data as JSON.
"""
print("正在进行混合解析...")
# 开始时间
start_time = time.time()
# 获取数据
data = await api.hybrid_parsing(url)
# 解析失败时返回错误信息,避免后续result.update(None)报错/Return an error payload if parsing failed, so result.update() below never receives None
if data is None:
data = {"status": "failed", "message": "解析失败,请检查输入的链接是否正确/Parsing failed, please check that the input link is correct"}
# 是否精简
if minimal:
result = api.hybrid_parsing_minimal(data)
else:
# 更新数据
result = {
'url': url,
"endpoint": "/api/",
"total_time": float(format(time.time() - start_time, '.4f')),
}
# 合并数据
result.update(data)
# 记录API调用
await api_logs(start_time=start_time,
input_data={'url': url},
endpoint='api')
return ORJSONResponse(result)
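# 调用示例(仅作演示,假设服务运行在 http://localhost:8000)/Client call sketch (illustrative only,
# assumes the service is reachable at http://localhost:8000); aiohttp is already used elsewhere in
# this module, and the example URL is taken from the docstring above.
async def demo_hybrid_parsing_request():
    async with aiohttp.ClientSession() as session:
        params = {"url": "https://v.douyin.com/MkmSwy7/", "minimal": "true"}
        async with session.get("http://localhost:8000/api", params=params) as resp:
            # 返回API_Hybrid_Response模型对应的JSON/The endpoint returns the API_Hybrid_Response fields as JSON
            return await resp.json()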
""" ________________________⬇抖音视频解析端点(Douyin video parsing endpoint)⬇________________________"""
# 获取抖音单个视频数据/Get Douyin single video data
@app.get("/douyin_video_data/", response_class=ORJSONResponse, response_model=API_Video_Response, tags=["Douyin"])
@limiter.limit(Rate_Limit)
async def get_douyin_video_data(request: Request, douyin_video_url: str = None, video_id: str = None):
"""
## 用途/Usage
- 获取抖音用户单个视频数据参数是视频链接|分享口令
- Get the data of a single video of a Douyin user, the parameter is the video link.
## 参数/Parameter
#### douyin_video_url(选填/Optional):
- 视频链接| 分享口令
- The video link.| Share code
- 例子/Example:
`https://www.douyin.com/video/7153585499477757192`
`https://v.douyin.com/MkmSwy7/`
#### video_id(选填/Optional):
- 视频ID可以从视频链接中获取
- The video ID, can be obtained from the video link.
- 例子/Example:
`7153585499477757192`
#### 备注/Note:
- 参数`douyin_video_url`和`video_id`二选一即可,如果都填写,优先使用`video_id`以获得更快的响应速度
- Provide either `douyin_video_url` or `video_id`; if both are supplied, `video_id` takes precedence for a faster response.
## 返回值/Return
- 用户单个视频数据的列表,列表内包含JSON数据
- A list containing the single video's data as JSON.
"""
if video_id is None or video_id == '':
# 获取视频ID
video_id = await api.get_douyin_video_id(douyin_video_url)
if video_id is None:
result = {
"status": "failed",
"platform": "douyin",
"message": "video_id获取失败/Failed to get video_id",
}
return ORJSONResponse(result)
if video_id is not None and video_id != '':
# 开始时间
start_time = time.time()
print('获取到的video_id数据:{}'.format(video_id))
if video_id is not None:
video_data = await api.get_douyin_video_data(video_id=video_id)
if video_data is None:
result = {
"status": "failed",
"platform": "douyin",
"endpoint": "/douyin_video_data/",
"message": "视频API数据获取失败/Failed to get video API data",
}
return ORJSONResponse(result)
# print('获取到的video_data:{}'.format(video_data))
# 记录API调用
await api_logs(start_time=start_time,
input_data={'douyin_video_url': douyin_video_url, 'video_id': video_id},
endpoint='douyin_video_data')
# 结束时间
total_time = float(format(time.time() - start_time, '.4f'))
# 返回数据
result = {
"status": "success",
"platform": "douyin",
"endpoint": "/douyin_video_data/",
"message": "获取视频数据成功/Got video data successfully",
"total_time": total_time,
"aweme_list": [video_data]
}
return ORJSONResponse(result)
else:
print('获取抖音video_id失败')
result = {
"status": "failed",
"platform": "douyin",
"endpoint": "/douyin_video_data/",
"message": "获取视频ID失败/Failed to get video ID",
"total_time": 0,
"aweme_list": []
}
return ORJSONResponse(result)
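# 调用示例(仅作演示,假设本地部署)/Client sketch (illustrative only, assumes a local deployment at
# http://localhost:8000): query a Douyin video directly by video_id, which skips URL resolution for a
# faster response as noted in the docstring above; the example ID comes from that docstring.
async def demo_douyin_video_by_id(video_id: str = "7153585499477757192"):
    async with aiohttp.ClientSession() as session:
        async with session.get("http://localhost:8000/douyin_video_data/",
                               params={"video_id": video_id}) as resp:
            data = await resp.json()
            # 成功时aweme_list中包含单个视频的JSON数据/On success, aweme_list holds the single video's JSON data
            return data.get("aweme_list")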
@app.get("/douyin_live_video_data/", response_class=ORJSONResponse, response_model=API_Video_Response, tags=["Douyin"])
@limiter.limit(Rate_Limit)
async def get_douyin_live_video_data(request: Request, douyin_live_video_url: str = None, web_rid: str = None):
"""
## 用途/Usage
- 获取抖音直播视频数据参数是视频链接|分享口令
- Get the data of a Douyin live video, the parameter is the video link.
## 失效待修复/Waiting for repair
"""
if web_rid is None or web_rid == '':
# 获取视频ID
web_rid = await api.get_douyin_video_id(douyin_live_video_url)
if web_rid is None:
result = {
"status": "failed",
"platform": "douyin",
"message": "web_rid获取失败/Failed to get web_rid",
}
return ORJSONResponse(result)
if web_rid is not None and web_rid != '':
# 开始时间
start_time = time.time()
print('获取到的web_rid:{}'.format(web_rid))
if web_rid is not None:
video_data = await api.get_douyin_live_video_data(web_rid=web_rid)
if video_data is None:
result = {
"status": "failed",
"platform": "douyin",
"endpoint": "/douyin_live_video_data/",
"message": "直播视频API数据获取失败/Failed to get live video API data",
}
return ORJSONResponse(result)
# print('获取到的video_data:{}'.format(video_data))
# 记录API调用
await api_logs(start_time=start_time,
input_data={'douyin_video_url': douyin_live_video_url, 'web_rid': web_rid},
endpoint='douyin_live_video_data')
# 结束时间
total_time = float(format(time.time() - start_time, '.4f'))
# 返回数据
result = {
"status": "success",
"platform": "douyin",
"endpoint": "/douyin_live_video_data/",
"message": "获取直播视频数据成功/Got live video data successfully",
"total_time": total_time,
"aweme_list": [video_data]
}
return ORJSONResponse(result)
else:
print('获取抖音video_id失败')
result = {
"status": "failed",
"platform": "douyin",
"endpoint": "/douyin_live_video_data/",
"message": "获取直播视频ID失败/Failed to get live video ID",
"total_time": 0,
"aweme_list": []
}
return ORJSONResponse(result)
@app.get("/douyin_profile_videos/", response_class=ORJSONResponse, response_model=None, tags=["Douyin"])
async def get_douyin_user_profile_videos(tikhub_token: str, douyin_user_url: str = None):
"""
## 用途/Usage
- 获取抖音用户主页数据参数是用户链接|ID
- Get the data of a Douyin user profile, the parameter is the user link or ID.
## 参数/Parameter
tikhub_token: https://api.tikhub.io/#/Authorization/login_for_access_token_user_login_post
"""
response = await api.get_douyin_user_profile_videos(tikhub_token=tikhub_token, profile_url=douyin_user_url)
return response
@app.get("/douyin_profile_liked_videos/", response_class=ORJSONResponse, response_model=None, tags=["Douyin"])
async def get_douyin_user_profile_liked_videos(tikhub_token: str, douyin_user_url: str = None):
"""
## 用途/Usage
- 获取抖音用户喜欢的视频数据,参数是用户链接|ID
- Get the liked videos of a Douyin user, the parameter is the user link or ID.
## 参数/Parameter
tikhub_token: https://api.tikhub.io/#/Authorization/login_for_access_token_user_login_post
"""
response = await api.get_douyin_profile_liked_data(tikhub_token=tikhub_token, profile_url=douyin_user_url)
return response
@app.get("/douyin_video_comments/", response_class=ORJSONResponse, response_model=None, tags=["Douyin"])
async def get_douyin_video_comments(tikhub_token: str, douyin_video_url: str = None):
"""
## 用途/Usage
- 获取抖音视频评论数据,参数是视频链接|分享口令
- Get the comment data of a Douyin video, the parameter is the video link or share code.
## 参数/Parameter
tikhub_token: https://api.tikhub.io/#/Authorization/login_for_access_token_user_login_post
"""
response = await api.get_douyin_video_comments(tikhub_token=tikhub_token, video_url=douyin_video_url)
return response
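# 调用示例(仅作演示)/Client sketch (illustrative only): the three TikHub-backed Douyin endpoints above
# all take a TikHub token as a query parameter; the token must be obtained from the TikHub login
# endpoint linked in the docstrings. The base URL below is an assumed local deployment of this service.
async def demo_douyin_video_comments(tikhub_token: str, douyin_video_url: str):
    async with aiohttp.ClientSession() as session:
        params = {"tikhub_token": tikhub_token, "douyin_video_url": douyin_video_url}
        async with session.get("http://localhost:8000/douyin_video_comments/", params=params) as resp:
            return await resp.json()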
""" ________________________⬇TikTok视频解析端点(TikTok video parsing endpoint)⬇________________________"""
# 获取TikTok单个视频数据/Get TikTok single video data
@app.get("/tiktok_video_data/", response_class=ORJSONResponse, response_model=API_Video_Response, tags=["TikTok"])
@limiter.limit(Rate_Limit)
async def get_tiktok_video_data(request: Request, tiktok_video_url: str = None, video_id: str = None):
"""
## 用途/Usage
- 获取单个视频数据参数是视频链接| 分享口令
- Get single video data, the parameter is the video link.
## 参数/Parameter
#### tiktok_video_url(选填/Optional):
- 视频链接| 分享口令
- The video link.| Share code
- 例子/Example:
`https://www.tiktok.com/@evil0ctal/video/7156033831819037994`
`https://vm.tiktok.com/TTPdkQvKjP/`
#### video_id(选填/Optional):
- 视频ID可以从视频链接中获取
- The video ID, can be obtained from the video link.
- 例子/Example:
`7156033831819037994`
#### 备注/Note:
- 参数`tiktok_video_url`和`video_id`二选一即可,如果都填写,优先使用`video_id`以获得更快的响应速度
- Provide either `tiktok_video_url` or `video_id`; if both are supplied, `video_id` takes precedence for a faster response.
## 返回值/Return
- 用户单个视频数据的列表,列表内包含JSON数据
- A list containing the single video's data as JSON.
"""
# 开始时间
start_time = time.time()
if video_id is None or video_id == "":
video_id = await api.get_tiktok_video_id(tiktok_video_url)
if video_id is None:
return ORJSONResponse({"status": "fail", "platform": "tiktok", "endpoint": "/tiktok_video_data/",
"message": "获取视频ID失败/Get video ID failed"})
if video_id is not None and video_id != '':
print('开始解析单个TikTok视频数据')
video_data = await api.get_tiktok_video_data(video_id)
# TikTok的API数据如果为空或者返回的数据中没有视频数据就返回错误信息
# If the TikTok API data is empty or there is no video data in the returned data, an error message is returned
if video_data is None or video_data.get('aweme_id') != video_id:
print('视频数据获取失败/Failed to get video data')
result = {
"status": "failed",
"platform": "tiktok",
"endpoint": "/tiktok_video_data/",
"message": "视频数据获取失败/Failed to get video data"
}
return ORJSONResponse(result)
# 记录API调用
await api_logs(start_time=start_time,
input_data={'tiktok_video_url': tiktok_video_url, 'video_id': video_id},
endpoint='tiktok_video_data')
# 结束时间
total_time = float(format(time.time() - start_time, '.4f'))
# 返回数据
result = {
"status": "success",
"platform": "tiktok",
"endpoint": "/tiktok_video_data/",
"message": "获取视频数据成功/Got video data successfully",
"total_time": total_time,
"aweme_list": [video_data]
}
return ORJSONResponse(result)
else:
print('视频链接错误/Video link error')
result = {
"status": "failed",
"platform": "tiktok",
"endpoint": "/tiktok_video_data/",
"message": "视频链接错误/Video link error"
}
return ORJSONResponse(result)
# 获取TikTok用户视频数据/Get TikTok user video data
@app.get("/tiktok_profile_videos/", response_class=ORJSONResponse, response_model=None, tags=["TikTok"])
async def get_tiktok_profile_videos(tikhub_token: str, tiktok_video_url: str = None):
"""
## 用途/Usage
- 获取TikTok用户主页数据,参数是用户链接|ID
- Get the data of a TikTok user profile, the parameter is the user link or ID.
## 参数/Parameter
tikhub_token: https://api.tikhub.io/#/Authorization/login_for_access_token_user_login_post
"""
response = await api.get_tiktok_user_profile_videos(tikhub_token=tikhub_token, tiktok_video_url=tiktok_video_url)
return response
# 获取TikTok用户主页点赞视频数据/Get TikTok user profile liked video data
@app.get("/tiktok_profile_liked_videos/", response_class=ORJSONResponse, response_model=None, tags=["TikTok"])
async def get_tiktok_profile_liked_videos(tikhub_token: str, tiktok_video_url: str = None):
"""
## 用途/Usage
- 获取TikTok用户主页点赞视频数据,参数是用户链接|ID
- Get the liked videos of a TikTok user profile, the parameter is the user link or ID.
## 参数/Parameter
tikhub_token: https://api.tikhub.io/#/Authorization/login_for_access_token_user_login_post
"""
response = await api.get_tiktok_user_profile_liked_videos(tikhub_token=tikhub_token, tiktok_video_url=tiktok_video_url)
return response
""" ________________________⬇iOS快捷指令更新端点(iOS Shortcut update endpoint)⬇________________________"""
@app.get("/ios", response_model=iOS_Shortcut, tags=["iOS_Shortcut"])
async def Get_Shortcut():
data = {
'version': config["Web_API"]["iOS_Shortcut_Version"],
'update': config["Web_API"]['iOS_Shortcut_Update_Time'],
'link': config["Web_API"]['iOS_Shortcut_Link'],
'link_en': config["Web_API"]['iOS_Shortcut_Link_EN'],
'note': config["Web_API"]['iOS_Shortcut_Update_Note'],
'note_en': config["Web_API"]['iOS_Shortcut_Update_Note_EN'],
}
return ORJSONResponse(data)
""" ________________________⬇下载文件端点/函数(Download file endpoints/functions)⬇________________________"""
# 下载文件端点/Download file endpoint
@app.get("/download", tags=["Download"])
@limiter.limit(Rate_Limit)
async def download_file_hybrid(request: Request, url: str, prefix: bool = True, watermark: bool = False):
"""
## 用途/Usage
### [中文]
- [抖音|TikTok]链接作为参数提交至此端点返回[视频|图片]文件下载请求
### [English]
- Submit the [Douyin|TikTok] link as a parameter to this endpoint and return the [video|picture] file download request.
# 参数/Parameter
- url:str -> [Douyin|TikTok] [视频|图片] 链接/ [Douyin|TikTok] [video|image] link
- prefix: bool -> [True/False] 是否添加前缀/Whether to add a prefix
- watermark: bool -> [True/False] 是否添加水印/Whether to add a watermark
"""
# 是否开启此端点/Whether to enable this endpoint
if not config["Web_API"]["Download_Switch"]:
return ORJSONResponse({"status": "endpoint closed",
"message": "下载视频端点已关闭请在配置文件中开启/Download video endpoint is closed, please enable it in the configuration file"})
# 开始时间
start_time = time.time()
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
}
data = await api.hybrid_parsing(url)
if data is None:
return ORJSONResponse({"status": "failed",
"message": "解析失败,请检查输入的链接是否正确/Parsing failed, please check that the input link is correct"})
else:
# 记录API调用
await api_logs(start_time=start_time,
input_data={'url': url},
endpoint='download')
url_type = data.get('type')
platform = data.get('platform')
aweme_id = data.get('aweme_id')
file_name_prefix = config["Web_API"]["File_Name_Prefix"] if prefix else ''
root_path = config["Web_API"]["Download_Path"]
# 查看目录是否存在,不存在就创建
if not os.path.exists(root_path):
os.makedirs(root_path)
if url_type == 'video':
file_name = file_name_prefix + platform + '_' + aweme_id + '.mp4' if not watermark else file_name_prefix + platform + '_' + aweme_id + '_watermark' + '.mp4'
url = data.get('video_data').get('nwm_video_url_HQ') if not watermark else data.get('video_data').get(
'wm_video_url_HQ')
print('url: ', url)
file_path = root_path + "/" + file_name
print('file_path: ', file_path)
# 判断文件是否存在,存在就直接返回
if os.path.exists(file_path):
print('文件已存在,直接返回')
return FileResponse(path=file_path, media_type='video/mp4', filename=file_name)
else:
if platform == 'douyin':
async with aiohttp.ClientSession() as session:
async with session.get(url=url, headers=headers, allow_redirects=False) as response:
r = response.headers
cdn_url = r.get('location')
async with session.get(url=cdn_url) as res:
r = await res.content.read()
elif platform == 'tiktok':
async with aiohttp.ClientSession() as session:
async with session.get(url=url, headers=headers) as res:
r = await res.content.read()
with open(file_path, 'wb') as f:
f.write(r)
return FileResponse(path=file_path, media_type='video/mp4', filename=file_name)
elif url_type == 'image':
url = data.get('image_data').get('no_watermark_image_list') if not watermark else data.get(
'image_data').get('watermark_image_list')
print('url: ', url)
zip_file_name = file_name_prefix + platform + '_' + aweme_id + '_images.zip' if not watermark else file_name_prefix + platform + '_' + aweme_id + '_images_watermark.zip'
zip_file_path = root_path + "/" + zip_file_name
print('zip_file_name: ', zip_file_name)
print('zip_file_path: ', zip_file_path)
# 判断文件是否存在,存在就直接返回
if os.path.exists(zip_file_path):
print('文件已存在,直接返回')
return FileResponse(path=zip_file_path, media_type='application/zip', filename=zip_file_name)
file_path_list = []
for i in url:
async with aiohttp.ClientSession() as session:
async with session.get(url=i, headers=headers) as res:
content_type = res.headers.get('content-type')
file_format = content_type.split('/')[1]
r = await res.content.read()
index = int(url.index(i))
file_name = file_name_prefix + platform + '_' + aweme_id + '_' + str(
index + 1) + '.' + file_format if not watermark else \
file_name_prefix + platform + '_' + aweme_id + '_' + str(
index + 1) + '_watermark' + '.' + file_format
file_path = root_path + "/" + file_name
file_path_list.append(file_path)
print('file_path: ', file_path)
with open(file_path, 'wb') as f:
f.write(r)
if len(url) == len(file_path_list):
# 将所有图片写入压缩包,压缩包内只保留文件名/Write every image into the zip, storing only the file name inside the archive
with zipfile.ZipFile(zip_file_path, 'w') as zip_file:
for f in file_path_list:
zip_file.write(f, os.path.basename(f), zipfile.ZIP_DEFLATED)
return FileResponse(path=zip_file_path, media_type='application/zip', filename=zip_file_name)
else:
return ORJSONResponse(data)
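# 调用示例(仅作演示,假设服务运行在 http://localhost:8000)/Client sketch (illustrative only, assumes
# the service runs at http://localhost:8000): fetch a parsed video through the /download endpoint above
# and write the response body to a local file.
async def demo_download_to_file(share_url: str, out_file: str = "demo_download.mp4"):
    async with aiohttp.ClientSession() as session:
        params = {"url": share_url, "prefix": "true", "watermark": "false"}
        async with session.get("http://localhost:8000/download", params=params) as resp:
            with open(out_file, "wb") as f:
                f.write(await resp.read())
    return out_file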
# 批量下载文件端点/Batch download file endpoint
@app.get("/batch_download", tags=["Download"])
async def batch_download_file(url_list: str, prefix: bool = True):
"""
批量下载文件端点/Batch download file endpoint
未完工/Unfinished
"""
print('url_list: ', url_list)
return ORJSONResponse({"status": "failed",
"message": "嘿嘿嘿,这个功能还没做呢,等我有空再做吧/Hehehe, this function hasn't been done yet, I'll do it when I have time"})
# 抖音链接格式下载端点(video)/Douyin link format download endpoint(video)
@app.get("/video/{aweme_id}", tags=["Download"])
async def download_douyin_video(aweme_id: str, prefix: bool = True, watermark: bool = False):
"""
## 用途/Usage
### [中文]
- 将抖音域名改为当前服务器域名即可调用此端点返回[视频|图片]文件下载请求
- 例如原链接https://douyin.com/video/1234567890123456789 改成 https://api.douyin.wtf/video/1234567890123456789 即可调用此端点
### [English]
- Change the Douyin domain name to the current server domain name to call this endpoint and return the video file download request.
- For example, the original link: https://douyin.com/video/1234567890123456789 becomes https://api.douyin.wtf/video/1234567890123456789 to call this endpoint.
# 参数/Parameter
- aweme_id:str -> 抖音视频ID/Douyin video ID
- prefix: bool -> [True/False] 是否添加前缀/Whether to add a prefix
- watermark: bool -> [True/False] 是否添加水印/Whether to add a watermark
"""
# 是否开启此端点/Whether to enable this endpoint
if not config["Web_API"]["Download_Switch"]:
return ORJSONResponse({"status": "endpoint closed",
"message": "此端点已关闭请在配置文件中开启/This endpoint is closed, please enable it in the configuration file"})
video_url = f"https://www.douyin.com/video/{aweme_id}"
download_url = f"{domain}/download?url={video_url}&prefix={prefix}&watermark={watermark}"
return RedirectResponse(download_url)
# 抖音链接格式下载端点(note)/Douyin link format download endpoint(note)
@app.get("/note/{aweme_id}", tags=["Download"])
async def download_douyin_note(aweme_id: str, prefix: bool = True, watermark: bool = False):
"""
## 用途/Usage
### [中文]
- 将抖音域名改为当前服务器域名即可调用此端点,返回[视频|图片]文件下载请求
- 例如原链接:https://douyin.com/note/1234567890123456789 改成 https://api.douyin.wtf/note/1234567890123456789 即可调用此端点
### [English]
- Change the Douyin domain name to the current server domain name to call this endpoint and return the [video|image] file download request.
- For example, the original link: https://douyin.com/note/1234567890123456789 becomes https://api.douyin.wtf/note/1234567890123456789 to call this endpoint.
# 参数/Parameter
- aweme_id:str -> 抖音视频ID/Douyin video ID
- prefix: bool -> [True/False] 是否添加前缀/Whether to add a prefix
- watermark: bool -> [True/False] 是否添加水印/Whether to add a watermark
"""
# 是否开启此端点/Whether to enable this endpoint
if not config["Web_API"]["Download_Switch"]:
return ORJSONResponse({"status": "endpoint closed",
"message": "此端点已关闭请在配置文件中开启/This endpoint is closed, please enable it in the configuration file"})
video_url = f"https://www.douyin.com/video/{aweme_id}"
download_url = f"{domain}/download?url={video_url}&prefix={prefix}&watermark={watermark}"
return RedirectResponse(download_url)
# 抖音链接格式下载端点/Douyin link format download endpoint
@app.get("/discover", tags=["Download"])
async def download_douyin_discover(modal_id: str, prefix: bool = True, watermark: bool = False):
"""
## 用途/Usage
### [中文]
- 将抖音域名改为当前服务器域名即可调用此端点返回[视频|图片]文件下载请求
- 例如原链接https://www.douyin.com/discover?modal_id=1234567890123456789 改成 https://api.douyin.wtf/discover?modal_id=1234567890123456789 即可调用此端点
### [English]
- Change the Douyin domain name to the current server domain name to call this endpoint and return the video file download request.
- For example, the original link: https://douyin.com/discover?modal_id=1234567890123456789 becomes https://api.douyin.wtf/discover?modal_id=1234567890123456789 to call this endpoint.
# 参数/Parameter
- modal_id: str -> 抖音视频ID/Douyin video ID
- prefix: bool -> [True/False] 是否添加前缀/Whether to add a prefix
- watermark: bool -> [True/False] 是否添加水印/Whether to add a watermark
"""
# 是否开启此端点/Whether to enable this endpoint
if not config["Web_API"]["Download_Switch"]:
return ORJSONResponse({"status": "endpoint closed",
"message": "此端点已关闭请在配置文件中开启/This endpoint is closed, please enable it in the configuration file"})
video_url = f"https://www.douyin.com/discover?modal_id={modal_id}"
download_url = f"{domain}/download?url={video_url}&prefix={prefix}&watermark={watermark}"
return RedirectResponse(download_url)
# Tiktok链接格式下载端点(video)/Tiktok link format download endpoint(video)
@app.get("/{user_id}/video/{aweme_id}", tags=["Download"])
async def download_tiktok_video(user_id: str, aweme_id: str, prefix: bool = True, watermark: bool = False):
"""
## 用途/Usage
### [中文]
- 将TikTok域名改为当前服务器域名即可调用此端点返回[视频|图片]文件下载请求
- 例如原链接https://www.tiktok.com/@evil0ctal/video/7156033831819037994 改成 https://api.douyin.wtf/@evil0ctal/video/7156033831819037994 即可调用此端点
### [English]
- Change the TikTok domain name to the current server domain name to call this endpoint and return the video file download request.
- For example, the original link: https://www.tiktok.com/@evil0ctal/video/7156033831819037994 becomes https://api.douyin.wtf/@evil0ctal/video/7156033831819037994 to call this endpoint.
# 参数/Parameter
- user_id: str -> TikTok用户ID/TikTok user ID
- aweme_id: str -> TikTok视频ID/TikTok video ID
- prefix: bool -> [True/False] 是否添加前缀/Whether to add a prefix
- watermark: bool -> [True/False] 是否添加水印/Whether to add a watermark
"""
# 是否开启此端点/Whether to enable this endpoint
if not config["Web_API"]["Download_Switch"]:
return ORJSONResponse({"status": "endpoint closed",
"message": "此端点已关闭请在配置文件中开启/This endpoint is closed, please enable it in the configuration file"})
video_url = f"https://www.tiktok.com/{user_id}/video/{aweme_id}"
download_url = f"{domain}/download?url={video_url}&prefix={prefix}&watermark={watermark}"
return RedirectResponse(download_url)
# 定期清理[Download_Path]文件夹
# Periodically clean the [Download_Path] folder
def cleanup_path():
while True:
root_path = config["Web_API"]["Download_Path"]
timer = int(config["Web_API"]["Download_Path_Clean_Timer"])
# 查看目录是否存在,不存在就跳过
if os.path.exists(root_path):
time_now = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
print(f"{time_now}: Cleaning up the download folder...")
for file in os.listdir(root_path):
file_path = os.path.join(root_path, file)
try:
if os.path.isfile(file_path):
os.remove(file_path)
except Exception as e:
print(e)
else:
time_now = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
print(f"{time_now}: The download folder does not exist, skipping...")
time.sleep(timer)
""" ________________________⬇项目启动执行函数(Project start execution function)⬇________________________"""
# 程序启动后执行/Execute after program startup
@app.on_event("startup")
async def startup_event():
# 创建一个清理下载目录定时器线程并启动
# Create a timer thread to clean up the download directory and start it
download_path_clean_switch = bool(config["Web_API"]["Download_Path_Clean_Switch"])
if download_path_clean_switch:
# 启动清理线程(守护线程,随主进程一起退出)/Start the cleanup thread as a daemon so it exits with the main process
thread_1 = threading.Thread(target=cleanup_path, daemon=True)
thread_1.start()
if __name__ == '__main__':
# 建议使用gunicorn启动;使用uvicorn启动时,生产环境请将reload设置为False
# It is recommended to start with gunicorn; when starting with uvicorn in production, set reload to False.
# uvicorn web_api:app --host '0.0.0.0' --port 8000 --reload
uvicorn.run("web_api:app", host='0.0.0.0', port=port, reload=True, access_log=False)
View File
@ -1,399 +0,0 @@
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
# @Author: https://github.com/Evil0ctal/
# @Time: 2021/11/06
# @Update: 2024/03/25
# @Version: 3.1.8
# @Function:
# 用于在线批量解析Douyin/TikTok的无水印视频/图集。
# 基于 PyWebIO将scraper.py返回的内容显示在网页上。
import yaml
import os
import re
import time
from scraper import Scraper
from pywebio import *
from pywebio import config as pywebio_config
from pywebio.input import *
from pywebio.output import *
from pywebio.session import info as session_info, run_asyncio_coroutine
# 读取配置文件
with open('config.yml', 'r', encoding='utf-8') as yaml_file:
config = yaml.safe_load(yaml_file)
# 创建一个Scraper类的实例/Create an instance of the Scraper class
api = Scraper()
# 自动检测语言返回翻译/Auto detect language to return translation
def t(zh: str, en: str) -> str:
return zh if 'zh' in session_info.user_language else en
# 解析抖音分享口令中的链接并返回列表/Parse the link in the Douyin share command and return a list
def find_url(string: str) -> list:
url = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', string)
return url
# 校验输入值/Validate input value
def valid_check(input_data: str) -> str or None:
# 检索出所有链接并返回列表/Retrieve all links and return a list
url_list = find_url(input_data)
# 总共找到的链接数量/Total number of links found
total_urls = len(url_list)
if total_urls == 0:
return t('没有检测到有效的链接,请检查输入的内容是否正确。',
'No valid link detected, please check if the input content is correct.')
else:
# 最大接受提交URL的数量/Maximum number of URLs accepted
max_urls = config['Web_APP']['Max_Take_URLs']
if total_urls > int(max_urls):
warn_info = t('URL数量过多只会处理前{}个URL。'.format(max_urls),
'Too many URLs, only the first {} URLs will be processed.'.format(max_urls))
return warn_info
# 错误处理/Error handling
def error_do(reason: str, value: str) -> None:
# 输出一个毫无用处的信息
put_html("<hr>")
put_error(
t("发生了了意料之外的错误,输入值已被记录。", "An unexpected error occurred, the input value has been recorded."))
put_html('<h3>⚠{}</h3>'.format(t('详情', 'Details')))
put_table([
[t('原因', 'reason'), t('输入值', 'input value')],
[reason, value]])
put_markdown(t('可能的原因:', 'Possible reasons:'))
put_markdown(t('服务器可能被目标主机的防火墙限流(稍等片刻后再次尝试)',
'The server may be limited by the target host firewall (try again after a while)'))
put_markdown(t('输入了错误的链接(API-V1暂不支持主页链接解析)',
'Entered the wrong link (the home page link is not supported for parsing with API-V1)'))
put_markdown(
t('如果需要解析个人主页请使用TikHub_API', 'If you need to parse the personal homepage, please use TikHub_API'))
put_markdown(t('TikHub_API 文档: [https://api.tikhub.io/docs](https://api.tikhub.io/docs)',
'TikHub_API Documentation: [https://api.tikhub.io/docs](https://api.tikhub.io/docs)'))
put_markdown(t('该视频已经被删除或屏蔽(你看的都是些啥(⊙_⊙)?)',
'The video has been deleted or blocked (what are you watching (⊙_⊙)?)'))
put_markdown(t('其他原因(请联系作者)', 'Other reasons (please contact the author)'))
put_markdown(t('你可以在右上角的关于菜单中查看本站错误日志。',
'You can view the error log of this site in the about menu in the upper right corner.'))
put_markdown('[{}](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/issues)'.format(
t('点击此处在GitHub上进行反馈', 'Click here to give feedback on GitHub')))
put_html("<hr>")
if config['Web_APP']['Allow_Logs']:
# 如果douyin或tiktok在输入值中则记录到日志文件/If douyin or tiktok is in the input value, record it to the log file
if 'douyin' in value or 'tiktok' in value:
# 将错误记录在logs.txt中
error_date = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
print(f"{error_date}: 正在记录错误信息...")
with open('logs.txt', 'a') as f:
f.write(error_date + ":\n" + str(reason) + '\n' + "Input value: " + value + '\n')
else:
print(t('输入值中没有douyin或tiktok不记录到日志文件中',
'No douyin or tiktok in the input value, not recorded to the log file'))
# iOS快捷指令弹窗/IOS shortcut pop-up
def ios_pop_window():
with popup(t("iOS快捷指令", "iOS Shortcut")):
version = config["Web_API"]["iOS_Shortcut_Version"]
update = config["Web_API"]['iOS_Shortcut_Update_Time']
link = config["Web_API"]['iOS_Shortcut_Link']
link_en = config["Web_API"]['iOS_Shortcut_Link_EN']
note = config["Web_API"]['iOS_Shortcut_Update_Note']
note_en = config["Web_API"]['iOS_Shortcut_Update_Note_EN']
put_markdown(t('#### 📢 快捷指令介绍:', '#### 📢 Shortcut Introduction:'))
put_markdown(
t('快捷指令运行在iOS平台本快捷指令可以快速调用本项目的公共API将抖音或TikTok的视频或图集下载到你的手机相册中暂时只支持单个链接进行下载。',
'The shortcut runs on the iOS platform, and this shortcut can quickly call the public API of this project to download the video or album of Douyin or TikTok to your phone album. It only supports single link download for now.'))
put_markdown(t('#### 📲 使用方法 ①:', '#### 📲 Operation method ①:'))
put_markdown(t('在抖音或TikTok的APP内浏览你想要无水印保存的视频或图集。',
'The shortcut needs to be used in the Douyin or TikTok app, browse the video or album you want to save without watermark.'))
put_markdown(t('然后点击右下角分享按钮,选择更多,然后下拉找到 "抖音TikTok无水印下载" 这个选项。',
'Then click the share button in the lower right corner, select more, and then scroll down to find the "Douyin TikTok No Watermark Download" option.'))
put_markdown(t('如遇到通知询问是否允许快捷指令访问xxxx (域名或服务器),需要点击允许才可以正常使用。',
'If you are asked whether to allow the shortcut to access xxxx (domain name or server), you need to click Allow to use it normally.'))
put_markdown(t('该快捷指令会在你相册创建一个新的相薄方便你浏览保存的内容。',
'The shortcut will create a new album in your photo album to help you browse the saved content.'))
put_markdown(t('#### 📲 使用方法 ②:', '#### 📲 Operation method ②:'))
put_markdown(t('在抖音或TikTok的视频下方点击分享然后点击复制链接然后去快捷指令APP中运行该快捷指令。',
'Click share below the video of Douyin or TikTok, then click to copy the link, then go to the shortcut command APP to run the shortcut command.'))
put_markdown(t('如果弹窗询问是否允许读取剪切板请同意,随后快捷指令会将链接内容保存至相册中。',
'If a pop-up asks whether to allow reading the clipboard, please allow it; the shortcut will then save the linked content to your photo album.'))
put_html('<hr>')
put_text(t(f"最新快捷指令版本: {version}", f"Latest shortcut version: {version}"))
put_text(t(f"快捷指令更新时间: {update}", f"Shortcut update time: {update}"))
put_text(t(f"快捷指令更新内容: {note}", f"Shortcut update content: {note_en}"))
put_link("[点击获取快捷指令 - 中文]", link, new_window=True)
put_html("<br>")
put_link("[Click get Shortcut - English]", link_en, new_window=True)
# API文档弹窗/API documentation pop-up
def api_document_pop_window():
with popup(t("API文档", "API Document")):
put_markdown(t("💾TikHub_API文档", "💾TikHub_API Document"))
put_markdown(t('TikHub_API 支持抖音和TikTok的更多接口 如主页解析,视频解析,视频评论解析,个人点赞列表解析等...',
'TikHub_API supports more interfaces of Douyin and TikTok, such as home page parsing, video parsing, video comment parsing, personal like list parsing, etc...'))
put_link('[TikHub_API Docs]', 'https://api.tikhub.io/docs', new_window=True)
put_html('<hr>')
put_markdown(t("💽API-V1文档", "💽API-V1 Document"))
put_markdown(t("API-V1 支持抖音和TikTok的单一视频解析具体请查看接口文档。",
"API-V1 supports single video parsing of Douyin and TikTok. For details, please refer to the API documentation."))
put_link('[API-V1 Docs]', 'https://api.douyin.wtf/docs', new_window=True)
# 日志文件弹窗/Log file pop-up
def log_popup_window():
with popup(t('错误日志', 'Error Log')):
put_html('<h3>⚠️{}</h3>'.format(t('关于解析失败可能的原因', 'About the possible reasons for parsing failure')))
put_markdown(t('服务器可能被目标主机的防火墙限流(稍等片刻后再次尝试)',
'The server may be limited by the target host firewall (try again after a while)'))
put_markdown(t('输入了错误的链接(API-V1暂不支持主页链接解析)',
'Entered the wrong link (the home page link is not supported for parsing with API-V1)'))
put_markdown(
t('如果需要解析个人主页请使用TikHub_API', 'If you need to parse the personal homepage, please use TikHub_API'))
put_markdown(t('TikHub_API 文档: [https://api.tikhub.io/docs](https://api.tikhub.io/docs)',
'TikHub_API Documentation: [https://api.tikhub.io/docs](https://api.tikhub.io/docs)'))
put_markdown(t('该视频已经被删除或屏蔽(你看的都是些啥(⊙_⊙)?)',
'The video has been deleted or blocked (what are you watching (⊙_⊙)?)'))
put_markdown(t('[点击此处在GitHub上进行反馈](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/issues)',
'[Click here to feedback on GitHub](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/issues)'))
put_html('<hr>')
# 判断日志文件是否存在
if os.path.exists('logs.txt'):
put_text(t('点击logs.txt可下载日志:', 'Click logs.txt to download the log:'))
with open('./logs.txt', 'rb') as f:
content = f.read()
put_file('logs.txt', content=content)
put_text(content.decode('utf-8', errors='ignore'))
else:
put_text(t('日志文件不存在,请等发生错误时再回来看看。',
'The log file does not exist, please come back and take a look when an error occurs.'))
# 关于弹窗/About pop-up
def about_popup_window():
with popup(t('更多信息', 'More Information')):
put_html('<h3>👀{}</h3>'.format(t('访问记录', 'Visit Record')))
put_image('https://views.whatilearened.today/views/github/evil0ctal/TikTokDownload_PyWebIO.svg',
title='访问记录')
put_html('<hr>')
put_html('<h3>⭐Github</h3>')
put_markdown('[Douyin_TikTok_Download_API](https://github.com/Evil0ctal/Douyin_TikTok_Download_API)')
put_html('<hr>')
put_html('<h3>🎯{}</h3>'.format(t('反馈', 'Feedback')))
put_markdown('{}[issues](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/issues)'.format(
t('Bug反馈', 'Bug Feedback')))
put_html('<hr>')
put_html('<h3>💖WeChat</h3>')
put_markdown('WeChat: [Evil0ctal](https://mycyberpunk.com/)')
put_html('<hr>')
# 程序入口/Main interface
@pywebio_config(theme='minty', title='Douyin/TikTok online parsing and download without watermark | TikTok/抖音无水印在线解析下载', description='在线批量解析TikTok/抖音视频和图片,支持无水印下载,官方数据接口,稳定,开源,免费,无广告。| Online batch parsing of TikTok/Douyin videos and pictures, support for no watermark download, official data interface, stable, open source, free, no ads.')
async def main():
# 设置favicon
favicon_url = "https://raw.githubusercontent.com/Evil0ctal/Douyin_TikTok_Download_API/main/logo/logo192.png"
# 删除初始keywords, icon meta标签
session.run_js("""
$('head meta[name="keywords"]').remove();
$('head link[rel="icon"]').remove();
""")
# 关键字信息
keywords = config['Web_APP']['Keywords']
# 设置favicon,referrer,Keywords,Description,Author,Title
session.run_js(f"""
$('head').append('<link rel="icon" type="image/png" href="{favicon_url}">')
$('head').append('<meta name=referrer content=no-referrer>');
$('head').append('<meta name="keywords" content="{keywords}">')
$('head').append('<meta name="author" content="Evil0ctal">')
""")
# 修改footer
session.run_js("""$('footer').remove()""")
# 网站标题/Website title
title = t(config['Web_APP']['Web_Title'], config['Web_APP']['Web_Title_English'])
put_html(f"""
<div align="center">
<a href="https://douyin.wtf/" alt="logo" ><img src="{favicon_url}" width="100"/></a>
<h1 align="center">{title}</h1>
</div>
""")
put_row(
[put_button(t("快捷指令", 'Shortcuts'), onclick=lambda: ios_pop_window(), link_style=True, small=True),
put_button("API", onclick=lambda: api_document_pop_window(), link_style=True, small=True),
put_button(t("日志", "Log"), onclick=lambda: log_popup_window(), link_style=True, small=True),
put_button(t("关于", 'About'), onclick=lambda: about_popup_window(), link_style=True, small=True)
])
placeholder = t(
"批量解析请直接粘贴多个口令或链接无需使用符号分开支持抖音和TikTok链接混合暂时不支持作者主页链接批量解析。",
"Batch parsing, please paste multiple passwords or links directly, no need to use symbols to separate, support for mixing Douyin and TikTok links, temporarily not support for author home page link batch parsing.")
input_data = await textarea(t('请将抖音或TikTok的分享口令或网址粘贴于此',
"Please paste the share code or URL of [Douyin|TikTok] here"),
type=TEXT,
validate=valid_check, required=True,
placeholder=placeholder,
position=0)
url_lists = find_url(input_data)
# 解析开始时间
start = time.time()
# 成功/失败统计
success_count = 0
failed_count = 0
# 链接总数
url_count = len(url_lists)
# 解析成功的url
success_list = []
# 解析失败的url
failed_list = []
# 输出一个提示条
with use_scope('loading_text'):
# 输出一个分行符
put_row([put_html('<br>')])
put_warning(t('Server酱正收到你输入的链接啦(◍•ᴗ•◍)\n正在努力处理中,请稍等片刻...',
'ServerChan is receiving your input link! (◍•ᴗ•◍)\nEfforts are being made, please wait a moment...'))
# 结果页标题
put_scope('result_title')
# 遍历链接列表
for url_index, url in enumerate(url_lists, start=1):
# 链接编号(使用enumerate,避免重复链接时index()返回相同编号)/Link number via enumerate, so duplicate links do not repeat the same index
# 解析
data = await run_asyncio_coroutine(api.hybrid_parsing(video_url=url))
# 判断是否解析成功/失败
status = bool(data) and data.get('status') == 'success'
# 如果解析成功
if status:
# 创建一个视频/图集的公有变量
url_type = t('视频', 'Video') if data.get('type') == 'video' else t('图片', 'Image')
platform = data.get('platform')
table_list = [[t('类型', 'type'), t('内容', 'content')],
[t('解析类型', 'Type'), url_type],
[t('平台', 'Platform'), platform],
[f'{url_type} ID', data.get('aweme_id')],
[t(f'{url_type}描述', 'Description'), data.get('desc')],
[t('作者昵称', 'Author nickname'), data.get('author').get('nickname')],
[t('作者ID', 'Author ID'), data.get('author').get('unique_id')],
[t('API链接', 'API URL'),
put_link(t('点击查看', 'Click to view'),
f"{config['Web_API']['Domain']}/api?url={url}&minimal=false",
new_window=True)],
[t('API链接-精简', 'API URL-Minimal'),
put_link(t('点击查看', 'Click to view'),
f"{config['Web_API']['Domain']}/api?url={url}&minimal=true",
new_window=True)]
]
# 如果是视频/If it's video
if url_type == t('视频', 'Video'):
# 添加视频信息
table_list.insert(4, [t('视频链接-水印', 'Video URL-Watermark'),
put_link(t('点击查看', 'Click to view'),
data.get('video_data').get('wm_video_url_HQ'), new_window=True)])
table_list.insert(5, [t('视频链接-无水印', 'Video URL-No Watermark'),
put_link(t('点击查看', 'Click to view'),
data.get('video_data').get('nwm_video_url_HQ'), new_window=True)])
table_list.insert(6, [t('视频下载-水印', 'Video Download-Watermark'),
put_link(t('点击下载', 'Click to download'),
f"{config['Web_API']['Domain']}/download?url={url}&prefix=true&watermark=true",
new_window=True)])
table_list.insert(6, [t('视频下载-无水印', 'Video Download-No-Watermark'),
put_link(t('点击下载', 'Click to download'),
f"{config['Web_API']['Domain']}/download?url={url}&prefix=true&watermark=false",
new_window=True)])
# 如果是图片/If it's image
elif url_type == t('图片', 'Image'):
# 添加图片下载链接
table_list.insert(4, [t('图片打包下载-水印', 'Download images ZIP-Watermark'),
put_link(t('点击下载', 'Click to download'),
f"{config['Web_API']['Domain']}/download?url={url}&prefix=true&watermark=true",
new_window=True)])
table_list.insert(5, [t('图片打包下载-无水印', 'Download images ZIP-No-Watermark'),
put_link(t('点击下载', 'Click to download'),
f"{config['Web_API']['Domain']}/download?url={url}&prefix=true&watermark=false",
new_window=True)])
# 添加图片信息
no_watermark_image_list = data.get('image_data').get('no_watermark_image_list')
for image in no_watermark_image_list:
table_list.append([t('图片预览(如格式可显示): ', 'Image preview (if the format can be displayed):'),
put_image(image, width='50%')])
table_list.append([t('图片直链: ', 'Image URL:'),
put_link(t('⬆️点击打开图片⬆️', 'Click to open image⬆'), image,
new_window=True)])
# 向网页输出表格/Put table on web page
with use_scope(str(url_index)):
# 显示进度
put_info(
t(f'正在解析第{url_index}/{url_count}个链接: ', f'Parsing the {url_index}/{url_count}th link: '),
put_link(url, url, new_window=True), closable=True)
put_table(table_list)
put_html('<hr>')
scroll_to(str(url_index))
success_count += 1
success_list.append(url)
# print(f'success_count: {success_count}, success_list: {success_list}')
# 如果解析失败/Failed to parse
else:
failed_count += 1
failed_list.append(url)
# print(f'failed_count: {failed_count}, failed_list: {failed_list}')
error_msg = ((data.get('message') if data else None) or '解析失败/Parsing failed').split('/')
error_msg = t(error_msg[0], error_msg[-1])
with use_scope(str(url_index)):
error_do(reason=error_msg, value=url)
scroll_to(str(url_index))
# 全部解析完成跳出for循环/All parsing completed, break out of for loop
with use_scope('result_title'):
put_row([put_html('<br>')])
put_markdown(t('## 📝解析结果:', '## 📝Parsing results:'))
put_row([put_html('<br>')])
with use_scope('result'):
# 清除进度条
clear('loading_text')
# 滚动至result
scroll_to('result')
# for循环结束向网页输出成功提醒
put_success(t('解析完成啦 ♪(・ω・)ノ\n请查看以下统计信息如果觉得有用的话请在GitHub上帮我点一个Star吧',
'Parsing completed ♪(・ω・)ノ\nPlease check the following statistics, and if you think it\'s useful, please help me click a Star on GitHub!'))
# 将成功,失败以及总数量显示出来并且显示为代码方便复制
put_markdown(
f'**{t("成功", "Success")}:** {success_count} **{t("失败", "Failed")}:** {failed_count} **{t("总数量", "Total")}:** {success_count + failed_count}')
# 成功列表
if success_count != url_count:
put_markdown(f'**{t("成功列表", "Success list")}:**')
put_code('\n'.join(success_list))
# 失败列表
if failed_count > 0:
put_markdown(f'**{t("失败列表", "Failed list")}:**')
put_code('\n'.join(failed_list))
# 将url_lists显示为代码方便复制
put_markdown(t('**以下是您输入的所有链接:**', '**The following are all the links you entered:**'))
put_code('\n'.join(url_lists))
# 解析结束时间
end = time.time()
# 计算耗时,保留两位小数
time_consuming = round(end - start, 2)
# 显示耗时
put_markdown(f"**{t('耗时', 'Time consuming')}:** {time_consuming}s")
# 放置一个按钮,点击后跳转到顶部
put_button(t('回到顶部', 'Back to top'), onclick=lambda: scroll_to('1'), color='success', outline=True)
# 返回主页链接
put_link(t('再来一波 (つ´ω`)つ', 'Another wave (つ´ω`)つ'), '/')
if __name__ == '__main__':
# 获取空闲端口
if os.environ.get('PORT'):
port = int(os.environ.get('PORT'))
else:
# 在这里修改默认端口(记得在防火墙放行该端口)
port = int(config['Web_APP']['Port'])
# 判断是否使用CDN加载前端资源
cdn = str(config['Web_APP']['PyWebIO_CDN']).lower() == 'true'
# 启动Web服务\Start Web service
start_server(main, port=port, debug=False, cdn=cdn)