mirror of https://git.sr.ht/~cadence/NewLeaf synced 2024-10-10 03:17:30 +00:00

127 Commits

Author SHA1 Message Date
colgrave34
fde9f3272a Delete extra directory in Dockerfile 2024-07-14 23:09:55 +12:00
Cadence Ember
97b7661cc7 Fix feed channel IDs not starting with UC 2023-10-18 23:18:52 +13:00
b8499d3626 Fix ytInitialData extraction with new EU tracking consent cookie
Related yt-dlp fix: https://github.com/yt-dlp/yt-dlp/pull/7774
2023-08-27 22:18:04 +12:00
Cadence Ember
28511bdf96 change order of checks to fix KeyError bug with ytdlp 2023.6 2023-07-01 23:48:38 +12:00
Cadence Ember
f53dd28ada /channels/latest: author ID fallback in video entry 2023-03-27 21:17:27 +13:00
Cadence Ember
82f28cb99d Exclude m3u8 playlist formats 2023-03-13 01:00:19 +13:00
Cadence Ember
ecb4f4ccd1 Change http_dash_fragments detection based on yt-dlp translating adaptive streams to fragments
https://matrix.to/#/!vZmGzbsnHuVTLMItyg:cadence.moe/$VHp8D4XgW5jFbA6MhPaI52hZVMt_wpNy9gCSbF6Ot1Y?via=cadence.moe&via=matrix.org&via=envs.net
2023-03-13 00:36:08 +13:00
Cadence Ember
412b4934ed
Support new channel layout 2022-11-06 13:55:22 +13:00
Cadence Ember
714f1030fb
Channel path fixes I'm pretty sure I already did?
- use "channels" as default path, not "user"
- cache based on the combination of the path and the id
- fix channel latest
2022-09-12 23:13:37 +12:00
ac1aa07108 #29 Extract named channels using dynamic endpoint with second__path param instead of /user/ 2022-08-18 18:48:11 +12:00
29a3894337 #42 Return UNKNOWN error for not explicitly handled errors for channel extraction instead of stacktrace 2022-08-18 18:20:06 +12:00
Cadence Ember
c9b16d3efd
Explain bind_host in sample configuration 2022-05-22 00:57:13 +12:00
Cadence Ember
68cfbb809f
Remove with requests when it is unnecessary 2022-01-16 21:51:26 +13:00
Cadence Ember
73b4fbabf7
Do not actually write out pages. 2022-01-10 13:23:04 +13:00
Cadence Ember
36ae18c12f
Report errors when an account has been terminated 2022-01-10 13:02:08 +13:00
0a13ab88cb
Stream responses on /vi and /ggpht endpoints
The chunk_size=None parameter to iter_content lets us consume data as
soon as it arrives
https://docs.python-requests.org/en/master/api/#requests.Response.iter_content
2021-12-16 17:11:34 +13:00
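The streaming approach described in commit 0a13ab88cb can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper name `relay_chunks` is hypothetical, and `upstream` is assumed to be a `requests.Response` (or anything with a compatible `iter_content` method).

```python
from typing import Iterator


def relay_chunks(upstream) -> Iterator[bytes]:
    """Yield body chunks from `upstream` as soon as they arrive.

    `upstream` is assumed to expose iter_content(chunk_size=...), as
    requests.Response does. Passing chunk_size=None asks for data in
    whatever sizes it was received instead of waiting to fill a buffer.
    """
    for chunk in upstream.iter_content(chunk_size=None):
        if chunk:  # skip empty chunks
            yield chunk
```

With requests, the upstream response would need to be opened with `stream=True` (for example `requests.get(url, stream=True)`), otherwise the whole body is downloaded before `iter_content` is called.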
bopol
66b7d1bec8
Fix regular captions
This removes all of the code that was previously used to get them from
/timedtext, and instead, always uses whatever is extracted from the
video page.

This does unfortunately now require a whole video fetch just for the
captions. But assuming captions are only requested by a frontend, this
won't be a problem due to the memory cache. The captions link will be
in memory because the just-requested video is in memory too.
2021-12-16 12:23:26 +13:00
Cadence Ember
550b633663
Do not extract storyboards into v/a formats 2021-12-08 22:33:48 +13:00
8a13868db7
Fix recommended videos extraction on some remaining IDs
Example: https://www.youtube.com/watch?v=-_GDl6cBebQ
2e9a445bc3/yt_dlp/extractor/common.py (L816)

Amends f22decbb
2021-11-12 17:16:14 +13:00
Cadence Ember
e3854a6050
Extract fact check notices to second__clarification 2021-11-04 02:01:52 +13:00
Cadence Ember
65bb7a2c4c
Fix recommended extraction when fact check notice 2021-11-04 01:59:50 +13:00
f22decbb74
Fix recommended videos extraction on IDs starting with - and _
Let's just leverage yt_dlp instead of rolling our own algorithms and fix
this kind of issue (not finding yt_dlp dump file for a given video) once
and for all

Example videos:
* https://www.youtube.com/watch?v=-q78QXpSL2M
* https://www.youtube.com/watch?v=_4SKG5uUEqs
2021-11-03 22:52:32 +13:00
Cadence Ember
2a0291cd5b
Update link to NewLeaf installation 2021-08-19 18:48:16 +12:00
Cadence Ember
d2df18ff75
Fix file detection and recommendations 2021-08-16 21:56:32 +12:00
Cadence Ember
e496ccb45a
Update blurb in README to reflect current state 2021-08-12 18:28:38 +12:00
d1f46d5269
Fix link to documentation
As mentioned by @lesnake:matrix.org on matrix channel
2021-08-12 18:21:25 +12:00
Cadence Ember
7062999921
Add many venv names to gitignore 2021-08-11 22:22:23 +12:00
3f57d50893
Retrieve the first 20 comments of a video on /api/v1/comments/:videoid
Got some inspiration from https://github.com/nlitsme/youtube_tool (for the x-youtube-client-X
headers).
This is not a complete reimplementation of Invidious API as continuation is not implemented
(to retrieve more than the first 20 comments and comments replies), likes and replies count
are also missing.
2021-07-02 00:53:05 +12:00
Cadence Ember
1ea86101fd
Support premiere videos on channel 2021-07-01 23:42:53 +12:00
7d3b79b1cd
Change cookies to skip EU cookie consent page
See https://github.com/benbusby/whoogle-search/issues/311 for some
context.
We're now implementing
a726009987/youtube_dl/extractor/youtube.py (L263-L264)
2021-05-15 22:29:44 +12:00
Cadence Ember
18f5ef4c62
Quote json keys correctly 2021-05-14 18:46:46 +12:00
f0c9708d99
Fix search extractor ad section filtering
The ads sections had a carouselAdRenderer property, now they have a
promotedSparklesTextSearchRenderer property instead. As this may
change again in the future, we should just get all items as we
discriminate/filter them as videos afterwards with the videoRenderer
property.
2021-05-14 18:46:08 +12:00
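The filtering idea in commit f0c9708d99 can be sketched as below: take every item from every section and keep only those carrying a `videoRenderer` property, so the name of the ad renderer no longer matters. The function name `collect_video_renderers` and the simplified input shape (a list of dicts, each with a `contents` list) are assumptions for illustration, not the layout used by the actual extractor.

```python
def collect_video_renderers(sections):
    """Return the videoRenderer objects found in all sections.

    `sections` is assumed to be a list of dicts, each holding a
    "contents" list of items. Items without a "videoRenderer" key
    (ads, shelves, anything else) are ignored, whatever they are called.
    """
    videos = []
    for section in sections:
        for item in section.get("contents", []):
            if "videoRenderer" in item:
                videos.append(item["videoRenderer"])
    return videos
```

Because the check is on what an item is rather than on what it is not, a future rename of the ad property should not require another change.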
Cadence Ember
57b0a88a2e
Detect channels that do not exist
If error alerts exist, they will be logged. But it is reasonable to
assume that not all errors will be fatal, so we don't necessarily quit
parsing if we find one.

This also normalises the text error of the /latest response for a
missing channel, without changing its identifier.
2021-05-02 01:20:53 +12:00
Cadence Ember
50a4b7af45
Add a handler for ytdl search request message 2021-04-28 00:55:29 +12:00
Cadence Ember
e3595a455e
Remove "unknown download error" prefix
The reason for the error is known and is returned.
2021-04-28 00:08:07 +12:00
7737ea3ba5
Fix #26 append detailed error message returned by yt-dlp in video extractor
Fixes https://todo.sr.ht/~cadence/tube/26
2021-04-18 15:23:28 +12:00
Cadence Ember
5125bb9461
Don't fail if captions field is missing 2021-04-10 12:50:18 +12:00
be8a2dad5f
Remove extraneous " align:start position:0%" on auto-generated captions 2021-04-10 00:44:10 +12:00
Cadence Ember
1d52fca3a0
Support auto-generated captions
The caption extraction is now entirely in our own hands.
2021-04-05 01:23:54 +12:00
bopol
aaf7d65b32
change CONSENT cookie value
FX is accept all tracking, PENDING should imply no tracking
2021-04-04 14:45:54 +12:00
5f47e1a71b
Fix extracting with cookie consent page in EU
Fix #27 use maintained yt-dlp lib instead of youtube-dlc

Because of the following changes in YT, we have to switch to a
maintained library https://github.com/ytdl-org/youtube-dl/issues/28604
While yt-dlp is not fixed today, youtube-dl is fixed in master and as
yt-dlp is quick to merge upstream changes back to their repo, we can
hope the issue will also be fixed there timely.

For requests sent by us directly, we include the cookies.

Ref https://github.com/ytdl-org/youtube-dl/issues/28604
2021-04-03 15:09:58 +13:00
Cadence Ember
fe04a4dbd6
Fix temporary file removal again 2021-04-03 14:57:51 +13:00
Cadence Ember
20fa40dd3d
Add front page 2021-04-03 14:42:30 +13:00
Cadence Ember
ccd3513c46
Add robots.txt 2021-04-03 14:00:05 +13:00
Cadence Ember
0f877b06bc
Fix temporary file removal 2021-03-28 23:58:54 +13:00
ABeltramo
7ed3248104
Docker updates
- Removed default config file from Dockerfile
- added .git folder to dockerignore
2021-03-23 23:31:49 +13:00
Olivier
70c95f4b63
Allow configuring the bind host address and port. 2021-03-12 00:18:07 +13:00
Cadence Ember
6dfceea6a0
Move endpoint status to documentation 2021-02-27 15:08:39 +13:00
Cadence Ember
e8e68150de
Add link to documentation repo 2021-02-27 14:57:03 +13:00
Cadence Ember
abd6c8df2f
Rename to NewLeaf 2021-02-27 13:09:31 +13:00