Commit Graph

815 Commits

Author SHA1 Message Date
Kavin 6d59cdbe3a
Add support for extracting audio tracks. 2022-11-13 21:39:29 +00:00
Isira Seneviratne e4d982c7ea Fix license. 2022-11-12 07:29:15 +05:30
Isira Seneviratne ddbce3b83d Add Utils methods for URL encoding/decoding using UTF-8. 2022-11-12 07:29:15 +05:30
Isira Seneviratne 366f5c1632 Use StandardCharsets.UTF_8. 2022-11-12 07:29:15 +05:30
Isira Seneviratne 316d8573fa Use immutable sets in YoutubeParsingHelper. 2022-11-07 07:50:26 +05:30
AudricV 61ce041bda
[YouTube] Support handles and all custom channel names
More non-channel paths have been also added to the excluded custom name paths,
documentation and exception messages have been improved and fixed in some
places, and the licence header of YoutubeChannelLinkHandlerFactory has been
moved to its beginning and updated.
2022-11-03 19:46:42 +01:00
AudricV ffffb04439
Merge pull request #953 from Theta-Dev/attributed-text-desc
[YouTube] Add support for attributed text description
2022-11-03 18:34:30 +01:00
ThetaDev 592e1d6386 fix: parsing attributed description with no command runs 2022-11-03 12:10:52 +01:00
ThetaDev 099b53cc4f
[YouTube] Add parser for attributedDescription
Also update the mock of the next InnerTube endpoint response of the
YoutubeStreamExtractorDefaultTest.DescriptionTestUnboxing test class with an
attributedDescription instead of a regular description
2022-11-02 23:11:33 +01:00
Theta-Dev 20e4a35814
[YouTube] Support richGridRenderer on channel pages
YouTube is deploying a new layout on their channel pages, which uses richGridRenderer JSON objects.
2022-11-02 19:01:29 +01:00
AudricV 4cae66f1f9
Merge pull request #946 from chowder/dev
Add ability to identify short-form `StreamInfoItem`s
2022-11-01 12:19:58 +01:00
Tobi eb40bb8458
Merge pull request #959 from FireMasterK/playlist-info-item-uploader
Add uploaderUrl and uploaderVerified to PlaylistInfoItem.
2022-10-31 13:10:33 +01:00
chowder b1a899fd47 Fix null pointer exception 2022-10-31 11:12:23 +00:00
Stypox a4db106a66
Merge pull request #960 from AudricV/yt-workaround-403-errors-android-client
[YouTube] Workaround 403 HTTP errors of ANDROID client streams
2022-10-30 21:53:26 +01:00
Kavin 6a256d0631
Add uploader url and verified to PlaylistInfoItem. 2022-10-30 13:00:19 +00:00
Kavin f9bd08c649 Address reviews. 2022-10-30 01:25:30 +00:00
Caleb 9282c3c13b Fix exception message for YoutubeStreamInfoItemExtractor#isShortFormContent
Co-authored-by: AudricV <74829229+AudricV@users.noreply.github.com>
2022-10-30 01:23:15 +00:00
chowder daf5674951 Add ability to identify short-form StreamInfoItems 2022-10-30 01:23:12 +00:00
AudricV 7258a53225
[YouTube] Support new playlist layout
This new layout doesn't provide author thumbnails and is completely different
for metadata, so the code to get them has been refactored.

The code of learning playlists video count check has been also removed, as it
seems to be not relevant anymore (the video count seems to be returned for
these playlists with both layouts).

Finally, unneeded overrides of subchannel methods, which don't apply to the
YouTube service, have been removed.
2022-10-29 18:12:10 +02:00
AudricV 60e97cd274
[YouTube] Workaround getting streaming URLs returning 403 HTTP response codes
Using the player parameters used to get stories seems to fix the issue, which
affects currently only certain countries such as UK.

This is a workaround and should be fixed in a better way (by changing the
InnerTube additional client used for videos or finding what is now required in
Android player requests).
2022-10-29 17:58:33 +02:00
AudricV c230d84df1
Fix Checkstyle error in YoutubeCommentsInfoItemExtractor 2022-10-29 13:24:19 +02:00
xz-dev 0ffcb32d9c
[YouTube] Add comment reply count support (#936)
Add comment reply count support for YouTube and introduce `CommentsInfoItem.UNKNOWN_REPLY_COUNT` constant

Co-authored-by: AudricV <74829229+AudricV@users.noreply.github.com>
Co-authored-by: Tobi <TobiGr@users.noreply.github.com>
2022-10-15 12:40:06 +02:00
AudricV 8067c43837
[YouTube] Don't use a specific letter for the decryption function name pattern
Use the same possible characters for variables everywhere, in order to avoid
potential future throttling parameter decryption function name parsing issues
related to the usage of other letter(s) than b.
2022-09-24 21:49:22 +02:00
AudricV abcee87167
[YouTube] Fix throttling parameter decryption function regex
- Quote the function name, as it may contain special regex symbols, such as
  dollar;
- Support multiple lines;
- Use what looks like the end of the function for the end of the regex (this
  part is inspired from yt-dlp throttling parameter decryption regex);
- Move the throttling function body regex into a private and static constant.
2022-09-24 21:28:09 +02:00
ThetaDev 7244be7627
[YouTube] Add JavaScript lexer to parse completely throttling decryption function (#905) 2022-09-24 19:55:46 +02:00
Stypox a99af9bb6e
Merge pull request #927 from Stypox/fix-feed-thumbnail
[YouTube] Return mqdefault thumbnails in fast feed
2022-09-19 08:52:01 +02:00
Stypox 93ef73e2f2
[YouTube] Return mqdefault thumbnails in feed
The hqdefault thumbnails, which are used by default, have black bars at the top and at the bottom, and are thus not optimal.
2022-09-13 16:46:54 +02:00
AudricV d8ddeb549c
[YouTube] Support new trending structure and filter recently trending section
This new structure allow us to filter easily Trending shorts and Recently
trending sections.

On the previous one, this Recently trending section is now filtered, by
checking whether sections have a title, which isn't the case for normal trends
contrary to the other ones.

This makes that the extractor returns now only the real 50 "Now" YouTube
trends.

Elements inside arrays are now extracted dynamically instead of only the ones
of the first index, using Java 8's Stream API.

The getInitialPage() method of YoutubeTrendingExtractor can now throw a
ParsingException if no selected tab (corresponding to the one of the trends
type extracted) has been found.

Finally, the licence header has been moved to the top of the file and updated.
2022-09-11 18:52:12 +02:00
AudricV 884da40f65
Merge pull request #926 from AudricV/fix-yt-like-count-extraction
[YouTube] Fix extraction of like count with the new data model
2022-09-11 12:21:38 +02:00
AudricV 119b9129e2
[YouTube] Fix extraction of like count with the new data model
In the new data model currently A/B tested or deployed, the like count button
data is the same, but its path has been changed.

The extraction of the like count has been also improved, by using the multiple
accessibility data available instead of the only one which was used before.
This accessibility data has been also deprioritized, because it is language
dependent when there is no view.

Like count is always known, even for age-restricted videos, so returning -1 if
the like count extraction fails on this type of videos is not needed and hides
an extraction error: that's the reason why this return has been removed and the
exception will always be thrown, even if a video is age-restricted.
2022-09-10 23:28:38 +02:00
ThetaDev 4905f74700
[YouTube] Fix throttling parameter decryption on Android
Escape the curly brace in the regular expression used to parse the throttling
parameter decryption function to allow its compatibility on Android.
2022-09-10 17:03:39 +02:00
Isira Seneviratne 943b7c033b Remove EMPTY_STRING. 2022-08-24 06:59:17 +05:30
litetex d6577e5e0b
Merge pull request #882 from litetex/fix-all-tests
Fix all tests
2022-08-23 16:19:27 +02:00
AudricV 03d9a4fe9d
Fix typos in YoutubeStreamExtractor.tryDecryptUrl 2022-08-21 21:17:03 +02:00
litetex 9e93d6b193 YoutubeThrottlingDecrypter: Check if returned string is a valid JavaScript function 2022-08-21 20:16:45 +02:00
litetex ed0a07af4e YoutubeThrottlingDecrypter: Patch regex 2022-08-21 20:15:53 +02:00
litetex 8ff7a90f52 Improved consent cookie related constants and documentation 2022-08-21 18:41:40 +02:00
AudricV cb64a480cd
[YouTube] Remove deprecated code in YoutubeThrottlingDecrypter
YoutubeThrottlingDecrypter is now a non-instantiable final class and non-static
attributes have been removed.

The static attributes of this class have been renamed, in order to respect the
naming specification used.
2022-08-16 15:01:05 +02:00
litetex c1a72b8240 Use Android Studios code style 2022-08-14 14:48:29 +02:00
litetex fc27b8a5b8 Use preexisting ``ContentNotAvailableException`` 2022-08-14 14:48:29 +02:00
litetex ecfc370685 Fixed all YTMixPlaylists
Added option to choose if you want to consent or not - currently this is done by a static variable in ``YoutubeParsingHelper`` - may not be the best long-term solution but for now the tests work again (in EU countries) 🥳
2022-08-14 14:48:27 +02:00
litetex 2b6fe294b2 Fixed ``YoutubeMusicSearchExtractorTest``
The renderers and queries got changed.
2022-08-14 14:48:27 +02:00
AudricV 7bdca33a87
[YouTube] Ensure that an additional player response is the correct one
If YouTube detect that requests come from a third party client, they may
replace the real player response by another one of a video saying that this
content is not available on this app and to watch it on the latest version of
YouTube. We can detect this by checking whether the video ID of the player
response returned is the same as the one requested by the extractor.
2022-08-12 19:20:31 +02:00
AudricV c82317e318
[YouTube] Spoof more mobile clients
Additional parameters have been added to the player requests of ANDROID and IOS
clients:

- for both clients: osName and osVersion: their respective values are:
  - for the ANDROID one: Android and 12;
  - for the IOS one: iOS and 15.6.0.19G71.
- for the ANDROID client: androidTargetSdkVersion, with the Android SDK version
  corresponding to the Android version used in the player requests of this
  client. This parameter is now required with this client to be sure to get a
  correct player response, otherwise, the one of a video saying that this
  content is not available in this app and to watch it with the latest version
  of YouTube can be returned instead;
- for the IOS client: deviceMake, with Apple as its value.

The iOS version sent in the IOS client player requests has been also updated to
the version 15.6 of the OS.

Finally, a comment about the requirement to use the signature timestamp from
the player JavaScript base file for HTML5 player requests on videos with
obfuscated URLs has been added and replaces a previous one which may be not
true.
2022-08-12 19:20:31 +02:00
AudricV d0549a5a52
[YouTube] Update client versions and use a real version for the iOS client
The iOS version can be got easily in fact, by looking at the What's New section of the App Store' app page.
2022-08-12 19:20:31 +02:00
AudricV d7e678aca2
[YouTube] Improve WEB client version and API key HTML extraction
Common code in WEB client version HTML extraction has been deduplicated, usage of the Java 8 Stream API has been made and initial data fallback has been used as a last resort.
This means that the client version extraction from regexes will be used before this fallback, as it doesn't contain the full client version.
This can be used as a way to fingerprint the extractor, even if it seems to be not the case.
2022-08-12 19:20:30 +02:00
AudricV 5b548340e8
[YouTube] Catch any exception in YoutubeThrottlingDecrypter.apply and improve docs
This will prevent any future extractor break due to decryption failure, like it was excepted to be the case before.

Some documentation about the throttling decryption has been also improved.
2022-08-12 17:49:36 +02:00
Isira Seneviratne 1af6b8eedb Use Collections.singletonList(). 2022-07-27 07:35:57 +05:30
Isira Seneviratne ff60e05c76 Use Collections.singletonMap(). 2022-07-27 07:35:57 +05:30
Stypox 5ab74b3631
Merge pull request #857 from FireMasterK/video-title
Get original untranslated title for YouTube
2022-07-06 10:26:45 +02:00