diff --git a/PyPi/LICENSE b/PyPi/LICENSE deleted file mode 100644 index fd91e63..0000000 --- a/PyPi/LICENSE +++ /dev/null @@ -1,21 +0,0 @@ -MIT License - -Copyright (c) 2021 Evil0ctal - -Permission is hereby granted, free of charge, to any person obtaining a copy -of this software and associated documentation files (the "Software"), to deal -in the Software without restriction, including without limitation the rights -to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -copies of the Software, and to permit persons to whom the Software is -furnished to do so, subject to the following conditions: - -The above copyright notice and this permission notice shall be included in all -copies or substantial portions of the Software. - -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -SOFTWARE. diff --git a/PyPi/README.md b/PyPi/README.md deleted file mode 100644 index dca615b..0000000 --- a/PyPi/README.md +++ /dev/null @@ -1,502 +0,0 @@ -

-
- -
- Douyin_TikTok_Download_API (Douyin/TikTok watermark-free parsing API) -
-

- -

- Running instructions • - API usage • - Manual deployment • - Docker deployment • - Docker image • - Contributors -

- -
- -![](https://views.whatilearened.today/views/github/Evil0ctal/TikTokDownloader_PyWebIO.svg) -[![GitHub license](https://img.shields.io/github/license/Evil0ctal/TikTokDownloader_PyWebIO)](https://github.com/Evil0ctal/TikTokDownloader_PyWebIO/blob/main/LICENSE) -[![GitHub issues](https://img.shields.io/github/issues/Evil0ctal/TikTokDownloader_PyWebIO)](https://github.com/Evil0ctal/TikTokDownloader_PyWebIO/issues) -[![GitHub forks](https://img.shields.io/github/forks/Evil0ctal/TikTokDownloader_PyWebIO)](https://github.com/Evil0ctal/TikTokDownloader_PyWebIO/network) -[![GitHub stars](https://img.shields.io/github/stars/Evil0ctal/TikTokDownloader_PyWebIO)](https://github.com/Evil0ctal/TikTokDownloader_PyWebIO/stargazers) -[![Docker Image size](https://img.shields.io/docker/image-size/evil0ctal/douyin_tiktok_download_api?style=flat-square)](https://hub.docker.com/repository/docker/evil0ctal/douyin_tiktok_download_api) - -Language: [[English](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/README.en.md)] [[简体中文](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/README.md)] [[繁体中文](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/README.zh-TW.md)] - -> Note: This API is applicable to Douyin and TikTok. Douyin is TikTok in China. You can distribute or modify the code at -> will, but please mark the original author. 
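- -## 👻Introduction - -> For stability reasons, the demo site's /video (returns an mp4 file) and /music (returns an mp3 file) -> endpoints are temporarily disabled, and batch download on the results page is temporarily unavailable too. If you need these features, please deploy the project yourself. All other features still work on the demo site. The API server is guaranteed to be up 99% of the time, but parsing is not guaranteed to succeed 100% of the time; if parsing fails, wait a minute or two and retry. - -🚀Demo: [https://douyin.wtf/](https://douyin.wtf/) - -🛰API demo: [https://api.douyin.wtf/](https://api.douyin.wtf/) - -💾iOS Shortcut (Chinese): [Click to get](https://www.icloud.com/shortcuts/331073aca78345cf9ab4f73b6a457f97) ( -Updated on 2022/07/18; the shortcut checks for updates automatically, so you only need to install it once.) - -🌎iOS Shortcut (English): [Click to get](https://www.icloud.com/shortcuts/83548306bc0c4f8ea563108f79c73f8d) (Updated on -2022/07/18; the shortcut checks for updates automatically, so you only need to install it once.) - -🗂Shortcut release history: [Shortcuts release](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/issues/53) - -📦️TikTok/Douyin downloader (desktop app): [TikDown](https://github.com/Tairraos/TikDown/) - -This project uses [PyWebIO](https://github.com/pywebio/PyWebIO) and [Flask](https://github.com/pallets/flask) -to parse watermark-free Douyin videos/albums online and in batches with Python. - -It can be used to download videos whose authors have disabled downloading, or for data scraping, and it can be combined with the [built-in iOS Shortcuts app](https://apps.apple.com/cn/app/%E5%BF%AB%E6%8D%B7%E6%8C%87%E4%BB%A4/id915249334) -and this project's API to download from within the app. - -To use the shortcut, open the Douyin or TikTok app, select the video you want to save, tap the share button, and choose the "抖音TikTok无水印下载" (Douyin/TikTok watermark-free download) -option. If a notification asks whether to allow the shortcut to access xxxx (a domain or server), tap Allow, otherwise the shortcut will not work. Downloaded videos and albums are saved to a dedicated photo album for easy browsing. - -## 💡Project file structure - -``` -# Modify the contents of config.ini as needed -. 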
-└── Douyin_TikTok_Download_API/ - ├── /static (static front-end assets) - ├── web_zh.py (web entry point) - ├── web_api.py (API) - ├── scraper.py (parsing library) - ├── config.ini (configuration file for the whole project, including ports and proxy settings; edit it if needed) - ├── logs.txt (error log, generated automatically) - └── API_logs.txt (API call log, generated automatically) -``` - -## 💯Supported features: - -- Parse Douyin videos/albums -- Parse TikTok (international) videos -- Batch parsing (Douyin and TikTok links can be mixed) -- Batch download of watermark-free videos from the results page -- Available as a [pip package](https://pypi.org/project/DT-Scraper/) for convenience -- API access -- Parsing through a proxy -- In-app download of watermark-free videos/albums via [iOS Shortcuts](https://apps.apple.com/cn/app/%E5%BF%AB%E6%8D%B7%E6%8C%87%E4%BB%A4/id915249334) - ---- - -## 🤦‍Planned features: - -- [ ] Batch parsing from a (Douyin/TikTok) author profile link - ---- - -## 🧭Running instructions (tested with Python 3.8): -> 🚨If you want to deploy this project, see the deployment methods ([Docker deployment](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/README.md#%E9%83%A8%E7%BD%B2%E6%96%B9%E5%BC%8F%E4%BA%8C-docker "Docker deployment"), [Manual deployment](https://github.com/Evil0ctal/Douyin_TikTok_Download_API/blob/main/README.md#%E9%83%A8%E7%BD%B2%E6%96%B9%E5%BC%8F%E4%B8%80-%E6%89%8B%E5%8A%A8%E9%83%A8%E7%BD%B2 "Manual deployment")) - -- Clone this repository: - -```console -git clone https://github.com/Evil0ctal/Douyin_TikTok_Download_API.git -``` - -- Change into the repository directory: - -```console -cd Douyin_TikTok_Download_API -``` - -- Install the dependencies: - -```console -pip install -r requirements.txt -``` - -- Edit config.ini (optional): - -```console -vim config.ini -``` - -- Web parsing - -```console -# Run web_zh.py -python3 web_zh.py -``` - -- API - -```console -# Run web_api.py -python3 web_api.py -``` - -- Use the parsing library - -```python -# pip install DT-Scraper -from DT_scraper.scraper import Scraper - -api = Scraper() - -# Parse a Douyin video/album -douyin_data = api.douyin(input('Douyin video URL:')) -# Returns a dictionary -print(douyin_data) - -# Parse a TikTok video/album -tiktok_data = api.tiktok(input('TikTok video URL:')) -# Returns a dictionary -print(tiktok_data) - -# Parse using a proxy -api.tiktok(input('TikTok video URL:'), proxies = {"all": "127.0.0.1:2333"}) - -``` - -- Entry points (ports can be changed in config.ini) - -```text -Web entry point: -http://localhost(server IP):5000/ -API entry point: -http://localhost(server IP):2333/ -``` - -## 🗺️Supported input formats (including but not limited to the following examples): - -- Douyin share text 
(copied in the app) - -```text -Example: 7.43 pda:/ 让你在几秒钟之内记住我 https://v.douyin.com/L5pbfdP/ 复制此链接,打开Dou音搜索,直接观看视频! -``` - -- Douyin short URL (copied in the app) - -```text -Example: https://v.douyin.com/L4FJNR3/ -``` - -- Douyin regular URL (copied from the web version) - -```text -Example: -https://www.douyin.com/video/6914948781100338440 -``` - -- Douyin discover-page URL (copied in the app) - -```text -Example: -https://www.douyin.com/discover?modal_id=7069543727328398622 -``` - -- TikTok short URL (copied in the app) - -```text -Example: -https://vm.tiktok.com/TTPdkQvKjP/ -``` - -- TikTok regular URL (copied from the web version) - -```text -Example: -https://www.tiktok.com/@tvamii/video/7045537727743380782 -``` - -- Batch of Douyin/TikTok URLs (no separator needed) - -```text -Example: -2.84 nqe:/ 骑白马的也可以是公主%%百万转场变身 https://v.douyin.com/L4FJNR3/ 复制此链接,打开Dou音搜索,直接观看视频! -8.94 mDu:/ 让你在几秒钟之内记住我 https://v.douyin.com/L4NpDJ6/ 复制此链接,打开Dou音搜索,直接观看视频! -9.94 LWz:/ ok我坦白交代 %%knowknow https://v.douyin.com/L4NEvNn/ 复制此链接,打开Dou音搜索,直接观看视频! -https://www.tiktok.com/@gamer/video/7054061777033628934 -https://www.tiktok.com/@off.anime_rei/video/7059609659690339586 -https://www.tiktok.com/@tvamii/video/7045537727743380782 -``` - -## 🛰️API usage - -The API converts the request parameter into direct links to the watermark-free video/images, and can be combined with iOS Shortcuts for in-app download. - -- Parse request - -```text -http://localhost(server IP):2333/api?url="copied (Douyin/TikTok) share text/link" -``` - -- Response - -> Douyin video - -```json -{ - "analyze_time": "1.9043s", - "api_url": "https://www.iesdouyin.com/web/api/v2/aweme/iteminfo/?item_ids=6918273131559881997", - "nwm_video_url": "http://v3-dy-o.zjcdn.com/23f0dec312ede563bef881af9a88bdc7/624dd965/video/tos/cn/tos-cn-ve-15/eccedcf4386948f5b5a1f0bcfb3dcde9/?a=1128&br=2537&bt=2537&cd=0%7C0%7C0%7C0&ch=0&cr=0&cs=0&cv=1&dr=0&ds=3&er=&ft=sYGC~3E7nz7Th1PZSDXq&l=202204070118030102080650132A21E31F&lr=&mime_type=video_mp4&net=0&pl=0&qs=0&rc=M3hleDRsODlkMzMzaGkzM0ApODpmNWc4ODs5N2lmNzg5aWcpaGRqbGRoaGRmLi4ybnBrbjYuYC0tYy0wc3MtYmJjNTM2NjAtNDFjMzJgOmNwb2wrbStqdDo%3D&vl=&vr=", - "original_url": "https://v.douyin.com/L4FJNR3/", - "platform": "douyin", - "status": "success", - "url_type": "video", - "video_author": "Real机智张", - "video_author_id": "Rea1yaoyue", - 
"video_author_signature": "", - "video_author_uid": "59840491348", - "video_aweme_id": "6918273131559881997", - "video_comment_count": "89145", - "video_create_time": "1610786002", - "video_digg_count": "2968195", - "video_hashtags": [ - "百万转场变身" - ], - "video_music": "https://sf3-cdn-tos.douyinstatic.com/obj/ies-music/6910889805266504461.mp3", - "video_music_author": "梅尼耶", - "video_music_id": "6910889820861451000", - "video_music_mid": "6910889820861451021", - "video_music_title": "@梅尼耶创作的原声", - "video_play_count": "0", - "video_share_count": "74857", - "video_title": "骑白马的也可以是公主#百万转场变身", - "wm_video_url": "https://aweme.snssdk.com/aweme/v1/playwm/?video_id=v0300ffe0000c01a96q5nis1qu5b1u10&ratio=720p&line=0" -} -``` - -> 抖音图集 - -```json -{ - "album_author": "治愈图集", - "album_author_id": "ZYTJ2002", - "album_author_signature": "取无水印图", - "album_author_uid": "449018054867063", - "album_aweme_id": "7015137063141920030", - "album_comment_count": "5436", - "album_create_time": "1633338878", - "album_digg_count": "193734", - "album_hashtags": [ - "晚霞", - "治愈系", - "落日余晖", - "日落🌄" - ], - "album_list": [ - "https://p26-sign.douyinpic.com/tos-cn-i-0813/5223757a7bef4f8480cd25d0fa2d2d94~noop.webp?x-expires=1651856400&x-signature=K1VjJdWTHCAaYSz14y6NumjjtfI%3D&from=4257465056&s=PackSourceEnum_DOUYIN_REFLOW&se=false&biz_tag=aweme_images&l=202204070120460102101050412A210A47", - "https://p26-sign.douyinpic.com/tos-cn-i-0813/d99467672da840908acccf2d2b4b7ef7~noop.webp?x-expires=1651856400&x-signature=ncBb8Tt7z4PmpUyiCNr%2FJYnwRSA%3D&from=4257465056&s=PackSourceEnum_DOUYIN_REFLOW&se=false&biz_tag=aweme_images&l=202204070120460102101050412A210A47", - "https://p26-sign.douyinpic.com/tos-cn-i-0813/5c2562210b1a4d4c99d6d4dbd2f23f2b~noop.webp?x-expires=1651856400&x-signature=Rsmplb53IKfvKd3mmIb4iQNhlIE%3D&from=4257465056&s=PackSourceEnum_DOUYIN_REFLOW&se=false&biz_tag=aweme_images&l=202204070120460102101050412A210A47", - 
"https://p26-sign.douyinpic.com/tos-cn-i-0813/9bb74c0c6aff4217bd1491a077b2c817~noop.webp?x-expires=1651856400&x-signature=BLRyHoKP0ybIci57yneOca62dxI%3D&from=4257465056&s=PackSourceEnum_DOUYIN_REFLOW&se=false&biz_tag=aweme_images&l=202204070120460102101050412A210A47" - ], - "album_music": "https://sf6-cdn-tos.douyinstatic.com/obj/ies-music/6978805801733442341.mp3", - "album_music_author": "魏同学", - "album_music_id": "6978805810365271000", - "album_music_mid": "6978805810365270791", - "album_music_title": "@魏同学创作的原声", - "album_play_count": "0", - "album_share_count": "30717", - "album_title": "“山海自有归期 风雨自有相逢 意难平终将和解 万事终将如意”#晚霞 #治愈系 #落日余晖 #日落🌄", - "analyze_time": "1.0726s", - "api_url": "https://www.iesdouyin.com/web/api/v2/aweme/iteminfo/?item_ids=7015137063141920030", - "original_url": "https://v.douyin.com/Nb8jysN/", - "platform": "douyin", - "status": "success", - "url_type": "album" -} -``` - -> TikTok视频 - -```JSON -{ - "analyze_time": "5.0863s", - "nwm_video_url": "https://v19.tiktokcdn-us.com/cfa357dadd8f913f013a6d0b0dca293f/624e20fa/video/tos/useast5/tos-useast5-ve-0068c003-tx/3296231486014755a1b81aa70c349a53/?a=1233&br=6498&bt=3249&cd=0%7C0%7C0%7C3&ch=0&cr=3&cs=0&cv=1&dr=0&ds=6&er=&ft=bY1KJnB4TJBS6BMy-L1iVKP&l=20220406172333010113135214232FAB56&lr=all&mime_type=video_mp4&net=0&pl=0&qs=0&rc=MzpsaGY6Zjo7PDMzZzczNEApNjY6ZTtkOzxpN2Q3PDo5OmdgZ2BtcjQwai9gLS1kMS9zczJhLTEzYjEuMTJeXzQyLmM6Yw%3D%3D&vl=&vr=", - "original_url": "https://www.tiktok.com/@oregonzoo/video/7080938094823738666", - "platform": "tiktok", - "status": "success", - "url_type": "video", - "video_author": "oregonzoo", - "video_author_SecId": "MS4wLjABAAAArWNQ8-AZN6CxWOkqdeWsMBUuLDmJt8TWUAk0S4aWDW5V5EoqRbuczhaLnxJHCGob", - "video_author_diggCount": 94, - "video_author_followerCount": 1800000, - "video_author_followingCount": 39, - "video_author_heartCount": 29700000, - "video_author_id": "6699816060206171141", - "video_author_nickname": "Oregon Zoo", - "video_author_videoCount": 264, - 
"video_aweme_id": "7080938094823738666", - "video_comment_count": 61, - "video_create_time": "1648659375", - "video_digg_count": 11800, - "video_hashtags": [ - "redpanda", - "boop", - "sunshine" - ], - "video_music": "https://sf16.tiktokcdn-us.com/obj/ies-music-tx/7075363935741856558.mp3", - "video_music_author": "Gilderoy Dauterive", - "video_music_id": "7075363884613356330", - "video_music_title": "Be the Sunshine", - "video_music_url": "https://sf16.tiktokcdn-us.com/obj/ies-music-tx/7075363935741856558.mp3", - "video_play_count": 60100, - "video_ratio": "720p", - "video_share_count": 298, - "video_title": "Moshu ✨ #redpanda #boop #sunshine", - "wm_video_url": "https://v16m-webapp.tiktokcdn-us.com/0394b9183a5852d4392a7e804bf78c55/624e20f6/video/tos/useast5/tos-useast5-ve-0068c001-tx/fc63ae232e70466398b55ccf97eb3c67/?a=1988&br=6468&bt=3234&cd=0%7C0%7C1%7C0&ch=0&cr=0&cs=0&cv=1&dr=0&ds=3&er=&ft=XY53A3E7nz7Th-pZSDXq&l=202204061723290101131351171341B9BB&lr=tiktok_m&mime_type=video_mp4&net=0&pl=0&qs=0&rc=MzpsaGY6Zjo7PDMzZzczNEApOjo4aDMzZmRlN2loOWk6ZWdgZ2BtcjQwai9gLS1kMS9zczBhNGA0LTIwNjNiYDQ2YmE6Yw%3D%3D&vl=&vr=" -} -``` - -- 下载视频请求参数 - -```text -http://localhost(服务器IP):2333/video?url="复制的(抖音/TikTok)口令/链接" -# 返回无水印mp4文件 -``` - -- 下载音频请求参数 - -```text -http://localhost(服务器IP):2333/music?url="复制的(抖音/TikTok)口令/链接" -# 返回mp3文件 -``` - ---- - -## 💾部署(方式一 手动部署) - -> 注: -> 截图可能因更新问题与文字不符,一切请优先参照文字叙述。 - -> 最好将本项目部署至海外服务器(优先选择美国地区的服务器),否则可能会出现奇怪的问题。 - -例子: -项目部署在国内服务器,而人在美国,点击结果页面链接报错403 ,目测与抖音CDN有关系。 -项目部署在韩国服务器,解析TikTok报错 ,目测TikTok对某些地区或IP进行了限制。 - -> 使用宝塔Linux面板进行部署( -> 中文宝塔要强制绑定手机号了,很流氓且无法绕过,建议使用宝塔国际版,谷歌搜索关键字aapanel自行安装,部署步骤相似。) - -- 首先要去安全组开放5000和2333端口(Web默认5000,API默认2333,可以在文件config.ini中修改。) -- 在宝塔应用商店内搜索python并安装项目管理器 (推荐使用1.9版本) - -![](https://raw.githubusercontent.com/Evil0ctal/TikTokDownloader_PyWebIO/main/Screenshots/BT_Linux_Panel_Deploy_1.png) - ---- - -- 创建一个项目名字随意 -- 路径选择你上传文件的路径 -- Python版本需要至少3以上(在左侧版本管理中自行安装) -- 框架修改为`Flask` -- 启动方式修改为`python` -- 
For the Web startup file, choose `web_zh.py` -- For the API startup file, choose `web_api.py` -- Tick the option to install module dependencies -- Start on boot is optional -- If BT is also running other services such as `Nginx`, check whether the ports are already in use; the ports can be changed in config.ini. - -![](https://raw.githubusercontent.com/Evil0ctal/TikTokDownloader_PyWebIO/main/Screenshots/BT_Linux_Panel_Deploy_2.png) - -- If you expect a large number of requests, start the app under a process supervisor to keep the process from being shut down - ---- - -## 💾Deployment (method 2: Docker) - -- Install Docker - -```console -curl -fsSL get.docker.com -o get-docker.sh&&sh get-docker.sh &&systemctl enable docker&&systemctl start docker -``` - -- Keep only the config.ini and docker-compose.yml files -- Run the following command to start the container in the background - -```console -docker compose up -d -``` - -- View the container logs - -```console -docker logs -f douyin_tiktok_download_api -``` - -- Remove the container - -```console -docker rm -f douyin_tiktok_download_api -``` - -- Update - -```console -docker compose pull && docker compose down && docker compose up -d -``` - -## ❤️ Contributors - -[![](https://github.com/Evil0ctal.png?size=50)](https://github.com/Evil0ctal) -[![](https://github.com/jw-star.png?size=50)](https://github.com/jw-star) -[![](https://github.com/Jeffrey-deng.png?size=50)](https://github.com/Jeffrey-deng) -[![](https://github.com/chris-ss.png?size=50)](https://github.com/chris-ss) -[![](https://github.com/weixuan00.png?size=50)](https://github.com/weixuan00) -[![](https://github.com/Tairraos.png?size=50)](https://github.com/Tairraos) - -## 🎉Screenshots - -> Note: -> Screenshots may be out of date and differ from the text; when in doubt, follow the text. - -
Click to expand screenshots - 
- -- Main page - -![](https://github.com/Evil0ctal/TikTokDownloader_PyWebIO/blob/main/Screenshots/home.png) - ---- - -- Parsing finished - -> Single - -![](https://github.com/Evil0ctal/TikTokDownloader_PyWebIO/blob/main/Screenshots/single_result.png) - ---- - -> Batch - -![](https://github.com/Evil0ctal/TikTokDownloader_PyWebIO/blob/main/Screenshots/multi_results.png) - ---- - -- API request/response - -> Video response - -![](https://github.com/Evil0ctal/TikTokDownloader_PyWebIO/blob/main/Screenshots/api_video_result.png) - -> Album response - -![](https://github.com/Evil0ctal/TikTokDownloader_PyWebIO/blob/main/Screenshots/api_image_result.png) - -> TikTok response - -![](https://raw.githubusercontent.com/Evil0ctal/TikTokDownloader_PyWebIO/main/Screenshots/tiktok_API.png) - ---- - -
- -## :alembic: 技术栈 - -* [PyWebIO](https://www.pyweb.io/) + [Flask](https://flask.palletsprojects.com/) - -## :scroll: 许可证 - -MIT License - ---- -> GitHub [@Evil0ctal](https://github.com/Evil0ctal)  ·  -> Email Evil0ctal1985@gmail.com diff --git a/PyPi/dist/DT_Scraper-1.0.0.tar.gz b/PyPi/dist/DT_Scraper-1.0.0.tar.gz deleted file mode 100644 index 15e74b5..0000000 Binary files a/PyPi/dist/DT_Scraper-1.0.0.tar.gz and /dev/null differ diff --git a/PyPi/dist/DT_Scraper-1.0.1.tar.gz b/PyPi/dist/DT_Scraper-1.0.1.tar.gz deleted file mode 100644 index 9a9db81..0000000 Binary files a/PyPi/dist/DT_Scraper-1.0.1.tar.gz and /dev/null differ diff --git a/PyPi/pyproject.toml b/PyPi/pyproject.toml deleted file mode 100644 index 0ad39d0..0000000 --- a/PyPi/pyproject.toml +++ /dev/null @@ -1,6 +0,0 @@ -[build-system] -requires = [ - "setuptools>=42", - "wheel" -] -build-backend = "setuptools.build_meta" \ No newline at end of file diff --git a/PyPi/setup.py b/PyPi/setup.py deleted file mode 100644 index 8307f36..0000000 --- a/PyPi/setup.py +++ /dev/null @@ -1,53 +0,0 @@ -#! 
/usr/bin/env python -# -*- coding: utf-8 -*- -# RUN Command Line: -# python3 setup.py sdist (Build-check dist folder) -# python3 -m twine upload --repository pypi dist/* (Upload to PyPi) - -try: - from setuptools import setup -except ImportError: - from distutils.core import setup -import setuptools - -setup( - name='DT_Scraper', # package name - author='Evil0ctal', # author - version='1.0.1', # version number - license='MIT', - - description='Douyin/TikTok crawler and no watermark video download.', # description - long_description='''Douyin/TikTok crawler and no watermark video download.''', - author_email='evil0ctal1985@gmail.com', # your email - url='https://github.com/Evil0ctal/Douyin_TikTok_Download_API', # GitHub URL or any other project URL - # packages to include - # packages=setuptools.find_packages(exclude=['url2io',]), - packages=["src/DT_scraper"], - # keywords='NLP,tokenizing,Chinese word segementation', - # package_dir={'jieba':'jieba'}, - # package_data={'jieba':['*.*','finalseg/*','analyse/*','posseg/*']}, - - # dependencies - install_requires=[ - 'requests', - "tenacity", - ], - classifiers=[ - # 'Development Status :: 4 - Beta', - # 'Operating System :: Microsoft' # your operating system: OS Independent, Microsoft - 'Intended Audience :: Developers', - # 'License :: OSI Approved :: MIT License', - # 'License :: OSI Approved :: BSD License', # BSD license - 'Programming Language :: Python', # supported language - 'Programming Language :: Python :: 3', # Python versions ... - 'Programming Language :: Python :: 3.4', - 'Programming Language :: Python :: 3.5', - 'Programming Language :: Python :: 3.6', - 'Programming Language :: Python :: 3.7', - 'Programming Language :: Python :: 3.8', - 'Programming Language :: Python :: 3.9', - 'Programming Language :: Python :: 3.10', - 'Topic :: Software Development :: Libraries' - ], - zip_safe=True, -) diff --git a/PyPi/src/DT_Scraper.egg-info/PKG-INFO b/PyPi/src/DT_Scraper.egg-info/PKG-INFO deleted file mode 100644 index d29f4ce..0000000 --- a/PyPi/src/DT_Scraper.egg-info/PKG-INFO +++ /dev/null @@ -1,27 +0,0 @@ 
-Metadata-Version: 2.1 -Name: DT-Scraper -Version: 1.0.0 -Summary: Douyin/TikTok crawler and no watermark video download. -Home-page: https://github.com/Evil0ctal/Douyin_TikTok_Download_API -Author: Evil0ctal -Author-email: evil0ctal1985@gmail.com -License: MIT -Project-URL: Bug Tracker, https://github.com/Evil0ctal/Douyin_TikTok_Download_API/issues -Platform: UNKNOWN -Classifier: Intended Audience :: Developers -Classifier: Programming Language :: Python -Classifier: Programming Language :: Python :: 3 -Classifier: Programming Language :: Python :: 3.4 -Classifier: Programming Language :: Python :: 3.5 -Classifier: Programming Language :: Python :: 3.6 -Classifier: Programming Language :: Python :: 3.7 -Classifier: Programming Language :: Python :: 3.8 -Classifier: Programming Language :: Python :: 3.9 -Classifier: Programming Language :: Python :: 3.10 -Classifier: Topic :: Software Development :: Libraries -Requires-Python: >=3.6 -Description-Content-Type: text/markdown -License-File: LICENSE - -Douyin/TikTok crawler and no watermark video download. 
- diff --git a/PyPi/src/DT_Scraper.egg-info/SOURCES.txt b/PyPi/src/DT_Scraper.egg-info/SOURCES.txt deleted file mode 100644 index 4d821d6..0000000 --- a/PyPi/src/DT_Scraper.egg-info/SOURCES.txt +++ /dev/null @@ -1,13 +0,0 @@ -LICENSE -README.md -pyproject.toml -setup.cfg -setup.py -src/DT_Scraper.egg-info/PKG-INFO -src/DT_Scraper.egg-info/SOURCES.txt -src/DT_Scraper.egg-info/dependency_links.txt -src/DT_Scraper.egg-info/requires.txt -src/DT_Scraper.egg-info/top_level.txt -src/DT_Scraper.egg-info/zip-safe -src/DT_scraper/__init__.py -src/DT_scraper/scraper.py \ No newline at end of file diff --git a/PyPi/src/DT_Scraper.egg-info/dependency_links.txt b/PyPi/src/DT_Scraper.egg-info/dependency_links.txt deleted file mode 100644 index 8b13789..0000000 --- a/PyPi/src/DT_Scraper.egg-info/dependency_links.txt +++ /dev/null @@ -1 +0,0 @@ - diff --git a/PyPi/src/DT_Scraper.egg-info/requires.txt b/PyPi/src/DT_Scraper.egg-info/requires.txt deleted file mode 100644 index 365ee77..0000000 --- a/PyPi/src/DT_Scraper.egg-info/requires.txt +++ /dev/null @@ -1,2 +0,0 @@ -requests -tenacity diff --git a/PyPi/src/DT_Scraper.egg-info/top_level.txt b/PyPi/src/DT_Scraper.egg-info/top_level.txt deleted file mode 100644 index 742d7f0..0000000 --- a/PyPi/src/DT_Scraper.egg-info/top_level.txt +++ /dev/null @@ -1 +0,0 @@ -DT_scraper diff --git a/PyPi/src/DT_Scraper.egg-info/zip-safe b/PyPi/src/DT_Scraper.egg-info/zip-safe deleted file mode 100644 index d3f5a12..0000000 --- a/PyPi/src/DT_Scraper.egg-info/zip-safe +++ /dev/null @@ -1 +0,0 @@ - diff --git a/PyPi/src/DT_scraper/__init__.py b/PyPi/src/DT_scraper/__init__.py deleted file mode 100644 index e69de29..0000000 diff --git a/PyPi/src/DT_scraper/requirements.txt b/PyPi/src/DT_scraper/requirements.txt deleted file mode 100644 index 60444cf..0000000 --- a/PyPi/src/DT_scraper/requirements.txt +++ /dev/null @@ -1,2 +0,0 @@ -requests==2.28.0 -tenacity==8.0.1 \ No newline at end of file diff --git a/PyPi/src/DT_scraper/scraper.py 
b/PyPi/src/DT_scraper/scraper.py deleted file mode 100644 index 64ea7ba..0000000 --- a/PyPi/src/DT_scraper/scraper.py +++ /dev/null @@ -1,552 +0,0 @@ -#!/usr/bin/env python -# -*- encoding: utf-8 -*- -# @Author: https://github.com/Evil0ctal/ -# @Time: 2021/11/06 -# @Update: 2022/09/04 -# @Function: -# Core code, valued at one yuan (๑•̀ㅂ•́)و✧ -# Scrapes Douyin/TikTok data and returns it as a dictionary. -# input link, output dictionary. - - -import re -import json -import time -import requests -from tenacity import * - - -class Scraper: - """ - Scraper.douyin(link): - Takes a Douyin video/album link and returns a dictionary once parsing is done. - - Scraper.tiktok(link): - Takes a TikTok video/album link and returns a dictionary once parsing is done. - """ - - def __init__(self): - self.headers = { - 'user-agent': 'Mozilla/5.0 (Linux; Android 8.0; Pixel 2 Build/OPD3.170816.012) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Mobile Safari/537.36 Edg/87.0.664.66' - } - self.tiktok_headers = { - "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9", - "authority": "www.tiktok.com", - "Accept-Encoding": "gzip, deflate", - "Connection": "keep-alive", - "Host": "www.tiktok.com", - "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) coc_coc_browser/86.0.170 Chrome/80.0.3987.170 Safari/537.36", - } - - @retry(stop=stop_after_attempt(3), wait=wait_random(min=1, max=2)) - def douyin(self, original_url: str, proxies: dict = None): - """ - Parse Douyin link information through the official interface - :param proxies: Optional proxies dict, e.g. {'all': '127.0.0.1:2333'}; no proxy is used by default. 
- :param original_url: Douyin/TikTok link (long and short links are supported) - :return: dictionary containing the parsed data - """ - headers = self.headers - try: - # Start time - start = time.time() - # Check whether this is a user profile link - if 'user' in original_url: - return {'status': 'failed', 'reason': '暂不支持个人主页批量解析', 'function': 'Scraper.douyin()', - 'value': original_url} - else: - # Original video link - r = requests.get(url=original_url, headers=headers, allow_redirects=False, proxies=proxies) - try: - # 2021/12/11: Douyin now redirects the URL automatically, but the target can be read from the response headers - long_url = r.headers['Location'] - # Check whether this is a user profile link - if 'user' in long_url: - return {'status': 'failed', 'reason': '暂不支持个人主页批量解析', - 'function': 'Scraper.douyin()', - 'value': original_url} - except: - # On error, treat the input as a long link and extract the video ID from it directly - long_url = original_url - # Extract the video ID with a regular expression - try: - # First link type - # https://www.douyin.com/video/7086770907674348841 - key = re.findall('/video/(\d+)?', long_url)[0] - print('视频ID为: {}'.format(key)) - except Exception: - # Second link type - # https://www.douyin.com/discover?modal_id=7086770907674348841 - key = re.findall('modal_id=(\d+)', long_url)[0] - print('视频ID为: {}'.format(key)) - # Build the Douyin API URL - api_url = f'https://www.iesdouyin.com/web/api/v2/aweme/iteminfo/?item_ids={key}' - print("正在请求抖音API链接: " + '\n' + api_url) - # Parse the response as JSON - js = json.loads(requests.get(url=api_url, headers=headers, proxies=proxies).text) - aweme_id = str(js['item_list'][0]['aweme_id']) - share_url = re.sub("/\\?.*", "", js['item_list'][0]['share_url']) - if share_url is None: - share_url = ( - "https://www.iesdouyin.com/share/video/" + aweme_id) if aweme_id is not None else original_url; - try: - music_share_url = "https://www.iesdouyin.com/share/music/" + str(js['item_list'][0]['music']['mid']) - except: - music_share_url = None - # Check whether this is an album - if js['item_list'][0]['images'] is not None: - print("类型 = 图集") - # Type is album - url_type = 'album' - # Album title - album_title = str(js['item_list'][0]['desc']) - # Album author nickname - album_author = str(js['item_list'][0]['author']['nickname']) - # Album author signature - 
album_author_signature = str(js['item_list'][0]['author']['signature']) - # Album author UID - album_author_uid = str(js['item_list'][0]['author']['uid']) - # Album author Douyin ID - album_author_id = str(js['item_list'][0]['author']['unique_id']) - if album_author_id == "": - # If the author never changed their Douyin ID, use this value instead so the ID can still be obtained - album_author_id = str(js['item_list'][0]['author']['short_id']) - # Try to get the album BGM information - for key in js['item_list'][0]: - if key == 'music': - # Album BGM URL - album_music = str(js['item_list'][0]['music']['play_url']['url_list'][0] if len( - js['item_list'][0]['music']['play_url']['url_list']) > 0 else 'No BGM found') - # Album BGM title - album_music_title = str(js['item_list'][0]['music']['title']) - # Album BGM author - album_music_author = str(js['item_list'][0]['music']['author']) - # Album BGM ID - album_music_id = str(js['item_list'][0]['music']['id']) - # Album BGM MID - album_music_mid = str(js['item_list'][0]['music']['mid']) - break; - else: - # No BGM: fill every BGM field with a placeholder - album_music = album_music_title = album_music_author = album_music_id = album_music_mid = 'No BGM found ' - # Album ID - album_aweme_id = str(js['item_list'][0]['statistics']['aweme_id']) - # Comment count - album_comment_count = str(js['item_list'][0]['statistics']['comment_count']) - # Like count - album_digg_count = str(js['item_list'][0]['statistics']['digg_count']) - # Play count - album_play_count = str(js['item_list'][0]['statistics']['play_count']) - # Share count - album_share_count = str(js['item_list'][0]['statistics']['share_count']) - # Upload timestamp - album_create_time = str(js['item_list'][0]['create_time']) - # Collect the hashtags in a list - album_hashtags = [] - for tag in js['item_list'][0]['text_extra']: - album_hashtags.append(tag['hashtag_name']) - # Collect the watermark-free image URLs in a list - images_list = [] - for data in js['item_list'][0]['images']: - images_list.append(data['url_list'][0]) - # End time - end = time.time() - # Parsing time - analyze_time = format((end - start), '.4f') - # Store the information in a dictionary - album_data = {'status': 'success', - 'analyze_time': (analyze_time + 's'), - 'url_type': url_type, - 'platform': 'douyin', - 
'original_url': original_url, - 'share_url': share_url, - 'music_share_url': music_share_url, - 'api_url': api_url, - 'album_aweme_id': album_aweme_id, - 'album_title': album_title, - 'album_author': album_author, - 'album_author_signature': album_author_signature, - 'album_author_uid': album_author_uid, - 'album_author_id': album_author_id, - 'album_music': album_music, - 'album_music_title': album_music_title, - 'album_music_author': album_music_author, - 'album_music_id': album_music_id, - 'album_music_mid': album_music_mid, - 'album_comment_count': album_comment_count, - 'album_digg_count': album_digg_count, - 'album_play_count': album_play_count, - 'album_share_count': album_share_count, - 'album_create_time': album_create_time, - 'album_list': images_list, - 'album_hashtags': album_hashtags} - return album_data - else: - print("类型 = 视频") - # Type is video - url_type = 'video' - # Video title - video_title = str(js['item_list'][0]['desc']) - # Video author nickname - video_author = str(js['item_list'][0]['author']['nickname']) - # Video author Douyin ID - video_author_id = str(js['item_list'][0]['author']['unique_id']) - if video_author_id == "": - # If the author never changed their Douyin ID, use this value instead so the ID can still be obtained - video_author_id = str(js['item_list'][0]['author']['short_id']) - # vid - vid = str(js['item_list'][0]['video']['vid']) - # Watermark-free 1080p video URL - try: - r = requests.get( - "https://aweme.snssdk.com/aweme/v1/play/?video_id={}&radio=1080p&line=0".format(vid), - headers=headers, allow_redirects=False, proxies=proxies) - nwm_video_url_1080p = r.headers['Location'] - except: - nwm_video_url_1080p = "None" - # Watermarked video URL - wm_video_url = str(js['item_list'][0]['video']['play_addr']['url_list'][0]) - # Watermark-free video URL (replace the keyword 'playwm' with 'play' in the URL from the response JSON) - nwm_video_url = str(js['item_list'][0]['video']['play_addr']['url_list'][0]).replace('playwm', - 'play') - # Watermark-free direct link (as of 2022/01/01 the URL from the Douyin API redirects, so the direct link is read from the Location header) - r = requests.get(url=nwm_video_url, headers=headers, allow_redirects=False, proxies=proxies) - video_url = r.headers['Location'] - 
# Video author signature - video_author_signature = str(js['item_list'][0]['author']['signature']) - # Video author UID - video_author_uid = str(js['item_list'][0]['author']['uid']) - # Try to get the video background music - for key in js['item_list'][0]: - if key == 'music': - if len(js['item_list'][0]['music']['play_url']['url_list']) != 0: - # Video BGM URL - video_music = str(js['item_list'][0]['music']['play_url']['url_list'][0]) - else: - video_music = 'No BGM found' - # Video BGM title - video_music_title = str(js['item_list'][0]['music']['title']) - # Video BGM author - video_music_author = str(js['item_list'][0]['music']['author']) - # Video BGM ID - video_music_id = str(js['item_list'][0]['music']['id']) - # Video BGM MID - video_music_mid = str(js['item_list'][0]['music']['mid']) - break; - else: - video_music = video_music_title = video_music_author = video_music_id = video_music_mid = 'No BGM found' - # Video ID - video_aweme_id = str(js['item_list'][0]['statistics']['aweme_id']) - # Comment count - video_comment_count = str(js['item_list'][0]['statistics']['comment_count']) - # Like count - video_digg_count = str(js['item_list'][0]['statistics']['digg_count']) - # Play count - video_play_count = str(js['item_list'][0]['statistics']['play_count']) - # Share count - video_share_count = str(js['item_list'][0]['statistics']['share_count']) - # Upload timestamp - video_create_time = str(js['item_list'][0]['create_time']) - # Video cover - video_cover = js['item_list'][0]['video']['cover']['url_list'][0] - # Video dynamic cover - video_dynamic_cover = js['item_list'][0]['video']['dynamic_cover']['url_list'][0] - # Video original cover - video_origin_cover = js['item_list'][0]['video']['origin_cover']['url_list'][0] - # Collect the hashtags in a list - video_hashtags = [] - for tag in js['item_list'][0]['text_extra']: - video_hashtags.append(tag['hashtag_name']) - # End time - end = time.time() - # Parsing time - analyze_time = format((end - start), '.4f') - # Return a dictionary containing the data - video_data = {'status': 'success', - 'analyze_time': (analyze_time + 's'), - 'url_type': url_type, - 'platform': 'douyin', - 'original_url': original_url, - 'share_url': 
share_url, - 'music_share_url': music_share_url, - 'api_url': api_url, - 'video_title': video_title, - 'nwm_video_url': video_url, - 'nwm_video_url_1080p': nwm_video_url_1080p, - 'wm_video_url': wm_video_url, - 'video_aweme_id': video_aweme_id, - 'video_author': video_author, - 'video_author_signature': video_author_signature, - 'video_author_uid': video_author_uid, - 'video_author_id': video_author_id, - 'video_music': video_music, - 'video_music_title': video_music_title, - 'video_music_author': video_music_author, - 'video_music_id': video_music_id, - 'video_music_mid': video_music_mid, - 'video_comment_count': video_comment_count, - 'video_digg_count': video_digg_count, - 'video_play_count': video_play_count, - 'video_share_count': video_share_count, - 'video_create_time': video_create_time, - 'video_cover': video_cover, - 'video_dynamic_cover': video_dynamic_cover, - 'video_origin_cover': video_origin_cover, - 'video_hashtags': video_hashtags} - return video_data - except Exception as e: - # Return the exception - return {'status': 'failed', 'reason': e, 'function': 'Scraper.douyin()', 'value': original_url} - - @retry(stop=stop_after_attempt(3), wait=wait_random(min=1, max=2)) - def tiktok(self, original_url: str, proxies: dict = None): - """ - Parse a TikTok link - :param proxies: Optional proxies dict, e.g. {'all': '127.0.0.1:2333'}; no proxy is used by default. 
- :param original_url: TikTok link - :return: dict containing the information - """ - - headers = self.headers - # Start time - start = time.time() - # Validate the TikTok link - if '@' in original_url: - print("Target URL: ", original_url) - else: - # Get the original link from the response headers - response = requests.get(url=original_url, headers=headers, allow_redirects=False, proxies=proxies) - true_link = response.headers['Location'].split("?")[0] - original_url = true_link - # Second link type returned in TikTok's response headers - if '.html' in true_link: - response = requests.get(url=true_link, headers=headers, allow_redirects=False, proxies=proxies) - original_url = response.headers['Location'].split("?")[0] - print("Target URL: ", original_url) - try: - # Get the video ID - video_id = re.findall(r'/video/(\d+)?', original_url)[0] - print('Extracted TikTok video ID: {}'.format(video_id)) - # Try to get part of the video data from the TikTok web page - try: - tiktok_headers = self.tiktok_headers - html = requests.get(url=original_url, headers=tiktok_headers, proxies=proxies, timeout=1) - # Use a regex to find the JSON embedded in the page - resp = re.search('"ItemModule":{(.*)},"UserModule":', html.text).group(1) - resp_info = ('{"ItemModule":{' + resp + '}}') - result = json.loads(resp_info) - # Video JSON data obtained from the web page - video_info = result["ItemModule"][video_id] - except Exception: - video_info = None - # Get part of the video data from the official TikTok API - tiktok_api_link = 'https://api.tiktokv.com/aweme/v1/aweme/detail/?aweme_id={}'.format( - video_id) - print('Requesting API URL: {}'.format(tiktok_api_link)) - response = requests.get(url=tiktok_api_link, headers=headers, proxies=proxies).text - # Parse the API response as JSON - result = json.loads(response) - if 'image_post_info' in response: - # The link is an album (image post) link - url_type = 'album' - print('type album') - # Video title - album_title = result["aweme_detail"]["desc"] - # Video author nickname - album_author_nickname = result["aweme_detail"]['author']["nickname"] - # Video author ID - album_author_id = result["aweme_detail"]['author']["unique_id"] - # Upload timestamp - album_create_time = result["aweme_detail"]['create_time'] - # Video ID - album_aweme_id = result["aweme_detail"]['statistics']['aweme_id'] - try: - # Video BGM title - 
album_music_title = result["aweme_detail"]['music']['title'] - # Video BGM author - album_music_author = result["aweme_detail"]['music']['author'] - # Video BGM ID - album_music_id = result["aweme_detail"]['music']['id'] - # Video BGM URL - album_music_url = result["aweme_detail"]['music']['play_url']['url_list'][0] - except Exception: - album_music_title, album_music_author, album_music_id, album_music_url = "None", "None", "None", "None" - # Comment count - album_comment_count = result["aweme_detail"]['statistics']['comment_count'] - # Like count - album_digg_count = result["aweme_detail"]['statistics']['digg_count'] - # Play count - album_play_count = result["aweme_detail"]['statistics']['play_count'] - # Download count - album_download_count = result["aweme_detail"]['statistics']['download_count'] - # Share count - album_share_count = result["aweme_detail"]['statistics']['share_count'] - # Watermark-free album images - album_list = [] - for i in result["aweme_detail"]['image_post_info']['images']: - album_list.append(i['display_image']['url_list'][0]) - # End time - end = time.time() - # Parsing time - analyze_time = format((end - start), '.4f') - # Store the data - album_data = {'status': 'success', - 'analyze_time': (analyze_time + 's'), - 'url_type': url_type, - 'api_url': tiktok_api_link, - 'original_url': original_url, - 'platform': 'tiktok', - 'album_title': album_title, - 'album_list': album_list, - 'album_author_nickname': album_author_nickname, - 'album_author_id': album_author_id, - 'album_create_time': album_create_time, - 'album_aweme_id': album_aweme_id, - 'album_music_title': album_music_title, - 'album_music_author': album_music_author, - 'album_music_id': album_music_id, - 'album_music_url': album_music_url, - 'album_comment_count': album_comment_count, - 'album_digg_count': album_digg_count, - 'album_play_count': album_play_count, - 'album_share_count': album_share_count, - 'album_download_count': album_download_count - } - # Return a dict containing the data - return album_data - else: - # Type is video - url_type = 'video' - print('type video') - # Watermark-free video URL - nwm_video_url = 
result["aweme_detail"]["video"]["play_addr"]["url_list"][0] - try: - # Watermarked video URL - wm_video_url = result["aweme_detail"]["video"]['download_addr']['url_list'][0] - except Exception: - # Watermarked video URL - wm_video_url = 'None' - # Video title - video_title = result["aweme_detail"]["desc"] - # Video author nickname - video_author_nickname = result["aweme_detail"]['author']["nickname"] - # Video author ID - video_author_id = result["aweme_detail"]['author']["unique_id"] - # Upload timestamp - video_create_time = result["aweme_detail"]['create_time'] - # Video ID - video_aweme_id = result["aweme_detail"]['statistics']['aweme_id'] - try: - # Video BGM title - video_music_title = result["aweme_detail"]['music']['title'] - # Video BGM author - video_music_author = result["aweme_detail"]['music']['author'] - # Video BGM ID - video_music_id = result["aweme_detail"]['music']['id'] - # Video BGM URL - video_music_url = result["aweme_detail"]['music']['play_url']['url_list'][0] - except Exception: - video_music_title, video_music_author, video_music_id, video_music_url = "None", "None", "None", "None" - # Comment count - video_comment_count = result["aweme_detail"]['statistics']['comment_count'] - # Like count - video_digg_count = result["aweme_detail"]['statistics']['digg_count'] - # Play count - video_play_count = result["aweme_detail"]['statistics']['play_count'] - # Download count - video_download_count = result["aweme_detail"]['statistics']['download_count'] - # Share count - video_share_count = result["aweme_detail"]['statistics']['share_count'] - # Video cover - video_cover = result["aweme_detail"]['video']['cover']['url_list'][0] - # Video dynamic cover - video_dynamic_cover = result["aweme_detail"]['video']['dynamic_cover']['url_list'][0] - # Video original cover - video_origin_cover = result["aweme_detail"]['video']['origin_cover']['url_list'][0] - # Collect the hashtags in a list - video_hashtags = [] - for tag in result["aweme_detail"]['text_extra']: - if 'hashtag_name' in tag: - video_hashtags.append(tag['hashtag_name']) - else: - continue - if video_info is not None: - # Author follower count - video_author_followerCount = 
video_info['authorStats']['followerCount'] - # Author following count - video_author_followingCount = video_info['authorStats']['followingCount'] - # Total likes the author has received - video_author_heartCount = video_info['authorStats']['heartCount'] - # Author video count - video_author_videoCount = video_info['authorStats']['videoCount'] - # Number of posts the author has liked - video_author_diggCount = video_info['authorStats']['diggCount'] - else: - # Author follower count - video_author_followerCount = 'None' - # Author following count - video_author_followingCount = 'None' - # Total likes the author has received - video_author_heartCount = 'None' - # Author video count - video_author_videoCount = 'None' - # Number of posts the author has liked - video_author_diggCount = 'None' - # End time - end = time.time() - # Parsing time - analyze_time = format((end - start), '.4f') - # Store the data - video_data = {'status': 'success', - 'analyze_time': (analyze_time + 's'), - 'url_type': url_type, - 'api_url': tiktok_api_link, - 'original_url': original_url, - 'platform': 'tiktok', - 'video_title': video_title, - 'nwm_video_url': nwm_video_url, - 'wm_video_url': wm_video_url, - 'video_author_nickname': video_author_nickname, - 'video_author_id': video_author_id, - 'video_create_time': video_create_time, - 'video_aweme_id': video_aweme_id, - 'video_music_title': video_music_title, - 'video_music_author': video_music_author, - 'video_music_id': video_music_id, - 'video_music_url': video_music_url, - 'video_comment_count': video_comment_count, - 'video_digg_count': video_digg_count, - 'video_play_count': video_play_count, - 'video_share_count': video_share_count, - 'video_download_count': video_download_count, - 'video_author_followerCount': video_author_followerCount, - 'video_author_followingCount': video_author_followingCount, - 'video_author_heartCount': video_author_heartCount, - 'video_author_videoCount': video_author_videoCount, - 'video_author_diggCount': video_author_diggCount, - 'video_cover': video_cover, - 'video_dynamic_cover': video_dynamic_cover, - 'video_origin_cover': video_origin_cover, - 'video_hashtags': video_hashtags - } - # Return a dict containing the data - return 
video_data - except Exception as e: - # Exception handling - return {'status': 'failed', 'reason': e, 'function': 'Scraper.tiktok()', 'value': original_url} - - -if __name__ == '__main__': - # Test the class - scraper = Scraper() - while True: - url = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', - input("Enter your Douyin/TikTok url here to test: "))[0] - try: - if 'douyin.com' in url: - douyin_date = scraper.douyin(url) - print(douyin_date) - else: - tiktok_date = scraper.tiktok(url) - print(tiktok_date) - except Exception as e: - print("Error: " + str(e))
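
The removed `tiktok()` method extracts the video ID with a regex and then indexes deep into the API response, falling back to the string `"None"` whenever a key is missing. The following is a minimal sketch of how those two steps could be factored out, not the module's actual code: `extract_video_id` and `dig` are hypothetical helper names, and the sample payload is made up to mirror the `aweme_detail` fields read above.

```python
import re
from typing import Any, Optional

# Hypothetical helpers for illustration; neither exists in the removed module.
VIDEO_ID_RE = re.compile(r"/video/(\d+)")


def extract_video_id(url: str) -> Optional[str]:
    """Return the digits that follow '/video/' in a URL, or None if absent."""
    match = VIDEO_ID_RE.search(url)
    return match.group(1) if match else None


def dig(data: Any, *path: Any, default: Any = "None") -> Any:
    """Walk nested dicts/lists along `path`; return `default` if a step is missing."""
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Made-up payload shaped like the `aweme_detail` fields used in the diff.
sample = {"aweme_detail": {"music": {"title": "demo", "play_url": {"url_list": []}}}}

video_id = extract_video_id("https://www.tiktok.com/@user/video/1234567890?lang=en")
music_title = dig(sample, "aweme_detail", "music", "title")
# An empty url_list raises IndexError internally, so the default is returned.
music_url = dig(sample, "aweme_detail", "music", "play_url", "url_list", 0)
```

The default of `"None"` (a string) is chosen only to match the fallback values the removed code returns; a caller that prefers a real `None` can pass `default=None`.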