Commit Graph

53 Commits

Author SHA1 Message Date
Cadence Ember 68cfbb809f
Remove `with requests` when it is unnecessary 2022-01-16 21:51:26 +13:00
Cadence Ember 73b4fbabf7
Do not actually write out pages. 2022-01-10 13:23:04 +13:00
Cadence Ember 36ae18c12f
Report errors when an account has been terminated 2022-01-10 13:02:08 +13:00
bopol 66b7d1bec8
Fix regular captions
This removes all of the code that was previously used to get them from
/timedtext, and instead, always uses whatever is extracted from the
video page.

This does unfortunately now require a whole video fetch just for the
captions. But assuming captions are only requested by a frontend, this
won't be a problem due to the memory cache. The captions link will be
in memory because the just-requested video is in memory too.
2021-12-16 12:23:26 +13:00
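As a rough illustration of why the extra video fetch stays cheap, here is a minimal sketch of an in-memory cache with a time-to-live. `TTLCache`, `get_or_fetch`, `fake_fetch_video`, the 300-second lifetime, and the shape of the `captions` field are all hypothetical names and values chosen for this example; they are not taken from the project's code.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


class TTLCache:
    """Hypothetical in-memory cache; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # Maps key -> (time stored, value).
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[str], Any]) -> Any:
        """Return the cached value for key, calling fetch only on a miss or expiry."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]
        value = fetch(key)
        self._store[key] = (now, value)
        return value


# Records every simulated network fetch so the cache hit is visible.
fetch_calls: List[str] = []


def fake_fetch_video(video_id: str) -> Dict[str, Any]:
    """Stand-in for a full video page fetch (assumed response shape)."""
    fetch_calls.append(video_id)
    return {
        "videoId": video_id,
        "captions": [{"label": "English", "url": "/captions/en"}],
    }


video_cache = TTLCache(ttl_seconds=300)

# A frontend first requests the video...
video = video_cache.get_or_fetch("abc123", fake_fetch_video)
# ...and the captions request shortly afterwards is served from memory.
captions = video_cache.get_or_fetch("abc123", fake_fetch_video)["captions"]
```

In this sketch `fake_fetch_video` runs once even though the video is looked up twice, which is the behaviour the commit message relies on.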
Cadence Ember 550b633663
Do not extract storyboards into v/a formats 2021-12-08 22:33:48 +13:00
Cadence Ember e3854a6050
Extract fact check notices to second__clarification 2021-11-04 02:01:52 +13:00
Cadence Ember 65bb7a2c4c
Fix recommended extraction when fact check notice 2021-11-04 01:59:50 +13:00
Cadence Ember d2df18ff75
Fix file detection and recommendations 2021-08-16 21:56:32 +12:00
Lomanic 3f57d50893
Retrieve the first 20 comments of a video on /api/v1/comments/:videoid
Got some inspiration from https://github.com/nlitsme/youtube_tool (for the x-youtube-client-X
headers).
This is not a complete reimplementation of the Invidious API, as continuation is not implemented
(needed to retrieve more than the first 20 comments, and comment replies); like and reply counts
are also missing.
2021-07-02 00:53:05 +12:00
Cadence Ember 1ea86101fd
Support premiere videos on channel 2021-07-01 23:42:53 +12:00
Lomanic 7d3b79b1cd
Change cookies to skip EU cookie consent page
See https://github.com/benbusby/whoogle-search/issues/311 for some
context.
We're now implementing
a726009987/youtube_dl/extractor/youtube.py (L263-L264)
2021-05-15 22:29:44 +12:00
Lomanic f0c9708d99
Fix search extractor ad section filtering
The ads sections had a carouselAdRenderer property, now they have a
promotedSparklesTextSearchRenderer property instead. As this may
change again in the future, we should just get all items as we
discriminate/filter them as videos afterwards with the videoRenderer
property.
2021-05-14 18:46:08 +12:00
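The filtering approach described above can be sketched as follows. The function name `collect_videos` and the flat list of item dictionaries are assumptions made for illustration; only the property names `videoRenderer`, `carouselAdRenderer`, and `promotedSparklesTextSearchRenderer` come from the commit message.

```python
from typing import Any, Dict, List


def collect_videos(items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only items that carry a videoRenderer property.

    Ad items are never named explicitly, so a renamed ad property
    does not require a code change.
    """
    return [item["videoRenderer"] for item in items if "videoRenderer" in item]


# Hypothetical search result items, mixing ads and videos.
sample_items = [
    {"promotedSparklesTextSearchRenderer": {"title": "an ad"}},
    {"videoRenderer": {"videoId": "abc123"}},
    {"carouselAdRenderer": {}},
    {"videoRenderer": {"videoId": "def456"}},
]

videos = collect_videos(sample_items)
```

Selecting on what a video looks like, rather than excluding what an ad looks like, is the design choice the commit argues for.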
Cadence Ember 57b0a88a2e
Detect channels that do not exist
If error alerts exist, they will be logged. But it is reasonable to
assume that not all errors will be fatal, so we don't necessarily quit
parsing if we find one.

This also normalises the text error of the /latest response for a
missing channel, without changing its identifier.
2021-05-02 01:20:53 +12:00
Cadence Ember 50a4b7af45
Add a handler for ytdl search request message 2021-04-28 00:55:29 +12:00
Cadence Ember e3595a455e
Remove "unknown download error" prefix
The reason for the error is known and is returned.
2021-04-28 00:08:07 +12:00
Lomanic 7737ea3ba5
Fix #26 append detailed error message returned by yt-dlp in video extractor
Fixes https://todo.sr.ht/~cadence/tube/26
2021-04-18 15:23:28 +12:00
Cadence Ember 5125bb9461
Don't fail if captions field is missing 2021-04-10 12:50:18 +12:00
Lomanic be8a2dad5f
Remove extraneous " align:start position:0%" on auto-generated captions 2021-04-10 00:44:10 +12:00
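One way such a cleanup could look is sketched below. The function name `strip_cue_settings` and the sample caption text are hypothetical; the sketch assumes the extra text appears at the end of WebVTT cue timing lines, which is not confirmed by the commit message.

```python
SUFFIX = " align:start position:0%"


def strip_cue_settings(vtt_text: str) -> str:
    """Remove the trailing cue settings from timing lines, leaving other lines untouched."""
    cleaned = []
    for line in vtt_text.split("\n"):
        # Only timing lines contain "-->"; caption text is left alone.
        if "-->" in line and line.endswith(SUFFIX):
            line = line[: -len(SUFFIX)]
        cleaned.append(line)
    return "\n".join(cleaned)


# Hypothetical auto-generated caption fragment.
sample = (
    "WEBVTT\n"
    "\n"
    "00:00:00.000 --> 00:00:02.000 align:start position:0%\n"
    "hello world\n"
)

cleaned_sample = strip_cue_settings(sample)
```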
Cadence Ember 1d52fca3a0
Support auto-generated captions
The caption extraction is now entirely in our own hands.
2021-04-05 01:23:54 +12:00
bopol aaf7d65b32
change CONSENT cookie value
FX is accept all tracking, PENDING should imply no tracking
2021-04-04 14:45:54 +12:00
Lomanic 5f47e1a71b
Fix extracting with cookie consent page in EU
Fix #27 use maintained yt-dlp lib instead of youtube-dlc

Because of the following changes in YT, we have to switch to a
maintained library https://github.com/ytdl-org/youtube-dl/issues/28604
While yt-dlp is not fixed today, youtube-dl is fixed in master, and as
yt-dlp is quick to merge upstream changes back into its repo, we can
hope the issue will soon be fixed there too.

For requests sent by us directly, we include the cookies.

Ref https://github.com/ytdl-org/youtube-dl/issues/28604
2021-04-03 15:09:58 +13:00
Cadence Ember c8b4699922
Support topic channels with no videos tab
https://second.cadence.moe/api/v1/channels/UCr-iHMODX8D4a6MVQ_RtdQg
2021-02-19 01:17:54 +13:00
Lomanic 80b41c7725
Fix broken channel videos extraction failing with KeyError: 'gridVideoRenderer' 2021-02-19 00:59:55 +13:00
Cadence Ember 268457394f
Split out file cleanup code 2021-01-26 01:05:40 +13:00
Cadence Ember c837828a22
Captions: Error checking 2021-01-20 17:37:39 +13:00
Cadence Ember 8e69928756
Captions: Python code cleanup and optimisation 2021-01-20 17:36:50 +13:00
bopol 6709aa30c2
Implement captions
Automatic subtitles are not supported, because youtube_dlc does not
provide them.
2021-01-20 17:36:49 +13:00
Cadence Ember 39425f994a
Fix subscriber count extraction 2021-01-17 14:56:17 +13:00
Cadence Ember f1ddf66f50
Touch up Bopol's patch 2021-01-17 14:55:57 +13:00
bopol 6cc921c2dc
fix channel extraction when header is not available 2021-01-17 14:30:34 +13:00
Cadence Ember 20b133dbb6
Fix manifest 2020-12-18 19:54:06 +13:00
Cadence Ember e95d814709
Fix channel extraction when subscribers not available 2020-12-09 16:53:22 +13:00
Cadence Ember 10f8009101
Gracefully fail on feed fetch for invalid channel 2020-12-06 15:39:28 +13:00
Cadence Ember 554cd8cc3a
Improve ytInitialData extraction 2020-12-03 17:00:06 +13:00
Cadence Ember ba88c53857
Fix search; use youtube-dlc 2020-12-03 16:32:31 +13:00
Cadence Ember 87c7730fbc
Fetch pages using en locale 2020-10-25 18:02:05 +13:00
Cadence Ember 861f441f9f
Fix search 2020-10-24 00:36:20 +13:00
Cadence Ember 0b9874a4f4
Fix channels having videos 2020-10-04 18:38:41 +13:00
Cadence Ember e1bcc306b3
Fix for if channel has no videos 2020-10-03 01:17:23 +13:00
Cadence Ember c506f65c71
Use empty string instead of null if no description 2020-09-24 01:06:47 +12:00
Cadence Ember caee795b7e
Fix extracting empty description 2020-09-24 00:56:16 +12:00
Cadence Ember e18efc9591
Thread lock when using channel data cache 2020-09-06 00:31:17 +12:00
Cadence Ember 52b3ae07b1
Detect being rate limited 2020-09-01 01:17:17 +12:00
Cadence Ember 951c62d1a9
Remove useless print 2020-08-31 20:27:17 +12:00
Cadence Ember 105161299f
TTL cache for channel latest 2020-08-31 03:16:57 +12:00
Cadence Ember 0a6a07838d
Add publishedText to /channels/latest 2020-08-31 02:26:46 +12:00
Cadence Ember c2a7bb907b
Experimental: filter specific itags for dash video 2020-08-30 01:16:18 +12:00
Cadence Ember 4a4f48e9d9
Author banners and thumbnails are optional 2020-08-30 00:48:33 +12:00
Cadence Ember c832142b1d
Experimental fragment_base_url 2020-08-27 17:22:18 +12:00
Cadence Ember ddf52e6346
Add second__order field 2020-08-25 01:47:56 +12:00