Commit Graph

1528 Commits

Author SHA1 Message Date
AudricV 30a0f8c510
[MediaCCC] Extract audio language property for single language audio tracks 2023-02-26 19:06:18 +01:00
AudricV 05e8cb39f7
[YouTube] Add language and descriptive audio properties to DASH manifests 2023-02-26 19:06:17 +01:00
AudricV 76b7c19c5d
[YouTube] Extract whether a track is a descriptive audio and audio locale when available
Also use audio track setters only for audio itags.
2023-02-26 19:06:17 +01:00
AudricV 3bb5eeef30
[YouTube] Add descriptive and locale audio support in ItagItem 2023-02-26 19:06:16 +01:00
AudricV 14bf3fb05b
Add ability to know the locale of an audio stream
Getting audio tracks locales by parsing their ID or their label, should not be
done by clients, but by the extractor.

This commit adds the ability to store the Locale of an AudioStream, which is
used to compare similar AudioStreams (in the equalStats method).
2023-02-26 19:06:16 +01:00
AudricV f92426560c
Add descriptive audio properties
Also improve AudioStream's audio language documentation
2023-02-26 19:06:16 +01:00
AudricV 1556adbb2d
[YouTube] Fix hashtags links extraction and escape text in attribute descriptions + HTML links
webCommandMetadata object is contained inside a commandMetadata one, so it is
not accessible from the root of the navigationEndpoint object.

The corresponding statement has been moved at the bottom of the specific
endpoints parsing, as the webCommandMetadata object is present almost
everywhere, otherwise URLs of some endpoints would have be changed, such as
uploader URLs (from channel IDs to handles).

As no ParsingException is now thrown by getUrlFromNavigationEndpoint, and so by
getTextFromObject, getUrlFromObject and getTextAtKey, the methods which were
catching ParsingExceptions thrown by these methods had to be updated.

URLs got in the HTML version of getTextFromObject are now escaped properly to
provide valid HTML to clients. This has been also done for attribute
descriptions, with the description text for this type of descriptions.

As YouTube descriptions are in HTML format (except for the fallback on the JSON
player response, which is plain text and only happens when there is no visual
metadata or a breaking change), all URLs returned are escaped, so tests which
are testing presence of URLs with escaped characters had to be updated (it was
only the case for YoutubeStreamExtractorDefaultTest.DescriptionTestUnboxing).
2023-02-26 18:43:36 +01:00
petlyh f7a7a236fb
[Bandcamp] Show comments as disabled on radio streams 2023-02-23 18:42:43 +01:00
TobiGr 3f7df9536e [YouTube] Fix getting the comment text if the comment contains a hashtag 2023-01-29 20:33:51 +01:00
Stypox 999fb7f812
Merge pull request #1024 from AudricV/snd_fix-tracks-like-count
[SoundCloud] Fix extraction of tracks like count
2023-01-29 10:52:54 +01:00
Stypox 3519d4c367
Merge pull request #1015 from AudricV/yt_fix-channel-id-rss-feeds
[YouTube] Fix channel ID extraction of YouTube channels RSS feeds
2023-01-29 10:41:38 +01:00
Stypox 9aca710e86
Merge pull request #1013 from Stypox/fix-music-mixes
[YouTube] Now music mixes can be treated as normal mixes
2023-01-29 09:48:51 +01:00
Stypox 76eeabac45
Merge pull request #1020 from TeamNewPipe/fix/yt-subscriber-count
[YouTube] Fix NPE in search when getting channel items without subscriber count
2023-01-29 09:44:22 +01:00
AudricV 2a24d407d5
[SoundCloud] Fix extraction of tracks like count
SoundCloud is using likes_count to return the like count of a track, like it
was the case before they switched to favoritings_count.
2023-01-29 01:00:49 +01:00
AudricV 57f850bc2d
[YouTube] Support live URLs and do minor improvements to YoutubeStreamLinkHandlerFactory
- Move license header at the top;
- Use an unmodifiable set for the subpaths instead of a modifiable list;
- Add missing Nonnull and Nullable annotations;
- Improve exception messages.
2023-01-28 19:36:20 +01:00
AudricV 1f4ed9dce9
[YouTube] Fix channel ID extraction of YouTube channel RSS feeds
The yt:channelId element doesn't provide the channel ID anymore and is empty,
like the id element, so we need now to extract it from the channel URL provided
in two elements: author -> uri and feed -> link.

Also avoid a NullPointerException in getUrl and getName methods.
2023-01-28 11:53:33 +01:00
Tobi c589a2c1a2
Merge pull request #1014 from TeamNewPipe/fix/yt-comments
[YouTube] Fix getting next comments pages
2023-01-27 11:14:55 +01:00
TobiGr 72573932cf [YouTube] Fix NPE in search when getting channel items without subscriber count 2023-01-24 23:03:45 +01:00
TobiGr f50b7275af [YouTube] Fix getting next comments pages 2023-01-24 22:39:08 +01:00
Kunal 9bdad40b06 Removed topStandaloneBadge 2023-01-20 02:41:21 +05:30
Stypox 7293991832
[YouTube] Now music mixes can be treated as normal mixes
Using a playlist extractor on them would result in "Unviewable playlist" errors
2023-01-15 23:28:59 +01:00
Stypox c1040bccac
Merge pull request #794 from FireMasterK/comments-count
[YouTube] Add support to extract total comment count
2023-01-11 15:32:19 +01:00
TobiGr 56aab4d971 [YouTube] Fix escaping links in YouTubeParsingHelper.getTextFromObject 2023-01-05 00:28:12 +01:00
Kavin 22a47da8c7
Fix requested change and remove outdated comment. 2023-01-02 20:42:32 +00:00
Kavin 98a90fd9c8
Don't cache comments count and return early on page fetch if no token. 2023-01-02 20:40:48 +00:00
Kavin 2974dfaa48
Only store ajaxJson for initial page and eager fetch the initial continuation. 2023-01-02 20:40:48 +00:00
Kavin 64d24aa09e
Fix request changes. 2023-01-02 20:40:48 +00:00
Kavin 67ef4f4c30
Cleanup and remove optional. 2023-01-02 20:40:48 +00:00
FireMasterK 22f71b010c
Fix for requested changes. 2023-01-02 20:40:48 +00:00
FireMasterK 656b7c1cd9
Improve method documentation. 2023-01-02 20:40:48 +00:00
FireMasterK 981aee4092
Add support to extract total comment count. 2023-01-02 20:40:48 +00:00
Stypox 45636b0d00
Merge pull request #986 from Isira-Seneviratne/Static_maps
Use immutable Map factory methods.
2023-01-02 18:11:14 +01:00
Stypox 219c5c5be5
Update extractor/src/main/java/org/schabi/newpipe/extractor/services/youtube/YoutubeParsingHelper.java 2023-01-02 18:11:03 +01:00
Stypox 259de3cba6
Merge pull request #995 from TeamNewPipe/feat/soundcloud-playlistinfoitemextractor
[SoundCloud] Implement getUploaderUrl() and isUploaderVerified() for PlaylistInfoItemExtractor
2023-01-02 15:10:40 +01:00
Stypox 991394b53a
Merge pull request #1005 from FireMasterK/fix-escaping-xss
Fix for potential XSS attacks and formatting issues
2023-01-02 15:06:17 +01:00
Isira Seneviratne d8ce08d969 Use immutable Map factory methods. 2023-01-02 07:50:31 +05:30
Kavin 01acf79436
Fix for potential XSS attacks. 2022-12-31 20:05:32 +00:00
TobiGr 292e0d8ce7 [SoundCloud] Implement getUploaderUrl() and isUploaderVerified() for PlaylistInfoItemExtractor 2022-12-31 18:46:39 +01:00
TobiGr 2a8729aeb2 Apply suggestions
Co-authored-by: Stypox <stypox@pm.me>
2022-12-31 18:24:33 +01:00
TobiGr d75a997611 [PeerTube] Support searching for channels 2022-12-31 18:24:33 +01:00
TobiGr dea6d8ce4c [PeerTube] Support searching for playlists 2022-12-31 18:24:33 +01:00
Stypox 95cc6aefbb
Merge pull request #994 from TeamNewPipe/fix/peertube-subtitles-exception
[PeerTube] Report Exceptions thrown while getting a stream's subtitles
2022-12-31 15:01:39 +01:00
Stypox 7b54457789
Merge pull request #941 from TeamNewPipe/feat/peertube-comment-replies
[PeerTube]  Support comment replies
2022-12-31 14:57:51 +01:00
AudricV f45966d449
Merge pull request #910 from Isira-Seneviratne/Locale_forLanguageTag
Add compat Locale.forLanguageTag() implementation.
2022-12-24 23:53:30 +01:00
AudricV d5437e0bc5
Merge pull request #863 from AudricV/add-content-type-and-content-length-headers-to-post-requests
Add Content-Type header to all POST requests without an empty body
2022-12-16 19:32:56 +01:00
AudricV 0766b1d211
[YouTube] Improve YoutubeStreamInfoItemExtractor
- Return duration of video premieres;
- Add another non-localized method to determine whether a stream is a running
livestream;
- Return view count and upload date of videos in playlists;
- Store isPremiere result;
- Remove shorts workaround code, as it was only useful on channels and shorts
have been moved into a separated channel tab;
- Improve some other code.
2022-12-08 13:59:12 +01:00
Tobi 896d7e09eb
Merge pull request #978 from Theta-Dev/fix/search-channel-handles
[YouTube] Fix search subscriber count extraction with channel handles
2022-12-05 17:52:05 +01:00
TobiGr cd3262745d [PeerTube] Report Exceptions thrown while getting a stream's subtitles 2022-12-03 16:11:21 +01:00
TobiGr 4e66b2287e [PeerTube] Add support for comment replies 2022-12-01 14:05:18 +01:00
Tobi 41c8dce452
Merge pull request #992 from Isira-Seneviratne/String_isBlank
Use String.isBlank().
2022-11-30 17:48:54 +01:00
Isira Seneviratne 2bca56f0df Use String.isBlank(). 2022-11-30 08:26:21 +05:30
Isira Seneviratne 3b80547976 Add code review suggestions. 2022-11-30 07:57:45 +05:30
ThetaDev 016623131e docs: update comment in YoutubeChannelInfoItemExtractor 2022-11-29 19:06:03 +01:00
Kavin abf08e1496
Merge pull request #990 from FireMasterK/bold-italic-strikethrough
[YouTube] Implement bold/italic/strike-through support
2022-11-29 15:59:38 +00:00
Kavin 52fda37915
Implement bold/italic/strike-through support. 2022-11-28 19:06:18 +00:00
Kavin b566084cac
Use Description object for comments text. 2022-11-28 17:02:19 +00:00
Tobi 1da0190056
Merge pull request #980 from TeamNewPipe/fix/yt/unavailable
[YouTube] Fix extracting the detailed error message for unavailable streams
2022-11-28 10:07:34 +01:00
Stypox 60fb30f835
Merge pull request #928 from FireMasterK/comment-urls
Parse YouTube comments as HTML
2022-11-27 19:16:34 +01:00
Kavin 5abea22225
Fix throwing correct reason. 2022-11-26 21:09:08 +00:00
Kavin c043597255
Update supported countries list. 2022-11-26 19:01:33 +00:00
TobiGr 4680df0bdf Fix throwing correct reason 2022-11-23 17:03:22 +01:00
TobiGr 9de8405c9f [YouTube] Fix extracting the detailed error message of streams which are unavailable 2022-11-23 08:33:06 +01:00
AudricV 3891542ca1
Use Downloader's postWithContentType and postWithContentTypeJson methods in services and extractors 2022-11-22 11:37:18 +01:00
AudricV b2862f3cd1
Add postWithContentType and postWithContentTypeJson utility methods in Downloader
Co-authored-by: Stypox <stypox@pm.me>
2022-11-22 11:37:17 +01:00
AudricV e9a0d3bd95
[YouTube] Send Content-Type header in all POST requests
This header was not sent partially before and was added and guessed by OkHttp. This can create issues when using other HTTP clients than OkHttp, such as Cronet.

Some code in the modified classes has been improved and / or deduplicated, and usages of the UTF_8 constant of the Utils class has been replaced by StandardCharsets.UTF_8 where possible.

Note that this header has been not added in except in YoutubeDashManifestCreatorsUtils, as an empty body is sent in the POST requests made by this class.
2022-11-22 11:37:16 +01:00
AudricV b9e463de49
[Bandcamp] Send Content-Type header in POST requests
This header was not sent before and was added and guessed by OkHttp. This can create issues when using other HTTP clients than OkHttp, such as Cronet.

Also make use of StandardCharsets.UTF_8 when getting bytes of bodies instead of the platform default's charset, to make sure to prevent some encoding issues on some JVMs.
2022-11-22 11:35:46 +01:00
AudricV 65d6321e3d
Fix typos in Downloader.post JavaDocs
Post methods in Downloader return the result of a POST request and not the one of a GET request.
2022-11-22 11:35:46 +01:00
ThetaDev 5daabd1793 fix: #976 search subscriber count extraction with channel handles 2022-11-22 02:17:10 +01:00
Kavin c953e23414
Merge pull request #968 from AudricV/yt-support-no-video-info-renderers-for-streams
[YouTube] Support lack of video info renderers for streams
2022-11-16 20:20:01 +00:00
Tobi 2211a24b69
Merge pull request #971 from lrusso96/patch-1
[YouTube] Improve duration parsing
2022-11-16 16:14:54 +01:00
Kavin 86f06b333a
Address review. 2022-11-14 00:05:31 +00:00
Kavin 30909da1df
Fix audio track similar comparison for IDs. 2022-11-13 23:08:54 +00:00
Kavin 6d59cdbe3a
Add support for extracting audio tracks. 2022-11-13 21:39:29 +00:00
Isira Seneviratne e4d982c7ea Fix license. 2022-11-12 07:29:15 +05:30
Isira Seneviratne ddbce3b83d Add Utils methods for URL encoding/decoding using UTF-8. 2022-11-12 07:29:15 +05:30
Isira Seneviratne 366f5c1632 Use StandardCharsets.UTF_8. 2022-11-12 07:29:15 +05:30
Luigi Russo c9635218e2
[YouTube] Improve duration parsing 2022-11-09 09:41:29 +01:00
Isira Seneviratne 316d8573fa Use immutable sets in YoutubeParsingHelper. 2022-11-07 07:50:26 +05:30
AudricV aa9a8ca23c
[YouTube] Make non-extraction of videoPrimaryInfoRenderer and/or videoSecondaryInfoRenderer not fatal
Also de-duplicated common code related to the obtain of these video info renderers.

This change allows extraction of videos without visual metadata.
2022-11-04 18:35:53 +01:00
AudricV 61ce041bda
[YouTube] Support handles and all custom channel names
More non-channel paths have been also added to the excluded custom name paths,
documentation and exception messages have been improved and fixed in some
places, and the licence header of YoutubeChannelLinkHandlerFactory has been
moved to its beginning and updated.
2022-11-03 19:46:42 +01:00
AudricV ffffb04439
Merge pull request #953 from Theta-Dev/attributed-text-desc
[YouTube] Add support for attributed text description
2022-11-03 18:34:30 +01:00
ThetaDev 592e1d6386 fix: parsing attributed description with no command runs 2022-11-03 12:10:52 +01:00
ThetaDev 099b53cc4f
[YouTube] Add parser for attributedDescription
Also update the mock of the next InnerTube endpoint response of the
YoutubeStreamExtractorDefaultTest.DescriptionTestUnboxing test class with an
attributedDescription instead of a regular description
2022-11-02 23:11:33 +01:00
Theta-Dev 20e4a35814
[YouTube] Support richGridRenderer on channel pages
YouTube is deploying a new layout on their channel pages, which uses richGridRenderer JSON objects.
2022-11-02 19:01:29 +01:00
AudricV 4cae66f1f9
Merge pull request #946 from chowder/dev
Add ability to identify short-form `StreamInfoItem`s
2022-11-01 12:19:58 +01:00
Tobi eb40bb8458
Merge pull request #959 from FireMasterK/playlist-info-item-uploader
Add uploaderUrl and uploaderVerified to PlaylistInfoItem.
2022-10-31 13:10:33 +01:00
chowder b1a899fd47 Fix null pointer exception 2022-10-31 11:12:23 +00:00
Stypox a4db106a66
Merge pull request #960 from AudricV/yt-workaround-403-errors-android-client
[YouTube] Workaround 403 HTTP errors of ANDROID client streams
2022-10-30 21:53:26 +01:00
Kavin b441910257
Mark uploaderUrl as nullable. 2022-10-30 13:36:04 +00:00
chowder 3fdc0e72cc Line breaks for long docstrings 2022-10-30 13:28:39 +00:00
Kavin 6a256d0631
Add uploader url and verified to PlaylistInfoItem. 2022-10-30 13:00:19 +00:00
Tobi 430504b4b5
Merge pull request #958 from AudricV/yt-playlists-support-new-metadata-format
[YouTube] Support new metadata format of playlists
2022-10-30 12:31:43 +01:00
Kavin f9bd08c649 Address reviews. 2022-10-30 01:25:30 +00:00
Caleb 9282c3c13b Fix exception message for YoutubeStreamInfoItemExtractor#isShortFormContent
Co-authored-by: AudricV <74829229+AudricV@users.noreply.github.com>
2022-10-30 01:23:15 +00:00
Caleb c5216f7c12 Update docstring for StreamExtractor#isShortFormContent
Co-authored-by: AudricV <74829229+AudricV@users.noreply.github.com>
2022-10-30 01:23:15 +00:00
chowder 4cccd33f3d Implement isShortFormContent for StreamExtractor and StreamInfo 2022-10-30 01:23:15 +00:00
chowder daf5674951 Add ability to identify short-form StreamInfoItems 2022-10-30 01:23:12 +00:00
Tobi 3d314169b9
Merge pull request #943 from TeamNewPipe/fix/sc/comments
[SoundCloud] Fix getting more comments
2022-10-29 22:19:50 +02:00
AudricV 7258a53225
[YouTube] Support new playlist layout
This new layout doesn't provide author thumbnails and is completely different
for metadata, so the code to get them has been refactored.

The code of learning playlists video count check has been also removed, as it
seems to be not relevant anymore (the video count seems to be returned for
these playlists with both layouts).

Finally, unneeded overrides of subchannel methods, which don't apply to the
YouTube service, have been removed.
2022-10-29 18:12:10 +02:00
AudricV 60e97cd274
[YouTube] Workaround getting streaming URLs returning 403 HTTP response codes
Using the player parameters used to get stories seems to fix the issue, which
affects currently only certain countries such as UK.

This is a workaround and should be fixed in a better way (by changing the
InnerTube additional client used for videos or finding what is now required in
Android player requests).
2022-10-29 17:58:33 +02:00
AudricV c230d84df1
Fix Checkstyle error in YoutubeCommentsInfoItemExtractor 2022-10-29 13:24:19 +02:00
xz-dev 0ffcb32d9c
[YouTube] Add comment reply count support (#936)
Add comment reply count support for YouTube and introduce `CommentsInfoItem.UNKNOWN_REPLY_COUNT` constant

Co-authored-by: AudricV <74829229+AudricV@users.noreply.github.com>
Co-authored-by: Tobi <TobiGr@users.noreply.github.com>
2022-10-15 12:40:06 +02:00
Isira Seneviratne b90a566dd8 Add backport implementation of Locale.forLanguageTag(). 2022-10-12 09:21:39 +05:30
Isira Seneviratne b232c29d22 Use Locale.forLanguageTag(). 2022-10-12 09:21:38 +05:30
TobiGr 4d136599bd [SoundCloud] Fix getting more comments 2022-10-11 15:44:54 +02:00
Tobi a822e91909
Merge pull request #939 from TurtleArmyMc/fix/SoundcloudPlaylistExtractor_track_order
Fix SoundcloudPlaylistExtractor: tracks are in correct order
2022-10-10 22:28:18 +02:00
TobiGr 02810a7db7 Add a comment 2022-10-10 22:22:12 +02:00
Tobi 54092fc3c7
Merge pull request #930 from Isira-Seneviratne/Avoid_Comparator_NPE
Avoid possible NullPointerException in MediaCCCRecentKiosk.
2022-10-10 11:43:34 +02:00
TurtleArmyMc bf70d32eb4 Fix SoundcloudPlaylistExtractor: tracks are in correct order 2022-09-30 15:26:25 -04:00
AudricV 8067c43837
[YouTube] Don't use a specific letter for the decryption function name pattern
Use the same possible characters for variables everywhere, in order to avoid
potential future throttling parameter decryption function name parsing issues
related to the usage of other letter(s) than b.
2022-09-24 21:49:22 +02:00
AudricV abcee87167
[YouTube] Fix throttling parameter decryption function regex
- Quote the function name, as it may contain special regex symbols, such as
  dollar;
- Support multiple lines;
- Use what looks like the end of the function for the end of the regex (this
  part is inspired from yt-dlp throttling parameter decryption regex);
- Move the throttling function body regex into a private and static constant.
2022-09-24 21:28:09 +02:00
ThetaDev 7244be7627
[YouTube] Add JavaScript lexer to parse completely throttling decryption function (#905) 2022-09-24 19:55:46 +02:00
Isira Seneviratne 0e31c86aee Avoid possible NullPointerException in MediaCCCRecentKiosk. 2022-09-21 05:40:07 +05:30
Stypox a99af9bb6e
Merge pull request #927 from Stypox/fix-feed-thumbnail
[YouTube] Return mqdefault thumbnails in fast feed
2022-09-19 08:52:01 +02:00
Kavin 4e14644c41
Ensure comments are parsed with URLs and timestamps. 2022-09-17 16:03:39 +05:30
Stypox 93ef73e2f2
[YouTube] Return mqdefault thumbnails in feed
The hqdefault thumbnails, which are used by default, have black bars at the top and at the bottom, and are thus not optimal.
2022-09-13 16:46:54 +02:00
AudricV d8ddeb549c
[YouTube] Support new trending structure and filter recently trending section
This new structure allow us to filter easily Trending shorts and Recently
trending sections.

On the previous one, this Recently trending section is now filtered, by
checking whether sections have a title, which isn't the case for normal trends
contrary to the other ones.

This makes that the extractor returns now only the real 50 "Now" YouTube
trends.

Elements inside arrays are now extracted dynamically instead of only the ones
of the first index, using Java 8's Stream API.

The getInitialPage() method of YoutubeTrendingExtractor can now throw a
ParsingException if no selected tab (corresponding to the one of the trends
type extracted) has been found.

Finally, the licence header has been moved to the top of the file and updated.
2022-09-11 18:52:12 +02:00
AudricV 884da40f65
Merge pull request #926 from AudricV/fix-yt-like-count-extraction
[YouTube] Fix extraction of like count with the new data model
2022-09-11 12:21:38 +02:00
AudricV 119b9129e2
[YouTube] Fix extraction of like count with the new data model
In the new data model currently A/B tested or deployed, the like count button
data is the same, but its path has been changed.

The extraction of the like count has been also improved, by using the multiple
accessibility data available instead of the only one which was used before.
This accessibility data has been also deprioritized, because it is language
dependent when there is no view.

Like count is always known, even for age-restricted videos, so returning -1 if
the like count extraction fails on this type of videos is not needed and hides
an extraction error: that's the reason why this return has been removed and the
exception will always be thrown, even if a video is age-restricted.
2022-09-10 23:28:38 +02:00
ThetaDev 4905f74700
[YouTube] Fix throttling parameter decryption on Android
Escape the curly brace in the regular expression used to parse the throttling
parameter decryption function to allow its compatibility on Android.
2022-09-10 17:03:39 +02:00
Isira Seneviratne 41254ae12a Remove annotation. 2022-08-24 06:59:17 +05:30
Isira Seneviratne 943b7c033b Remove EMPTY_STRING. 2022-08-24 06:59:17 +05:30
litetex d6577e5e0b
Merge pull request #882 from litetex/fix-all-tests
Fix all tests
2022-08-23 16:19:27 +02:00
AudricV 03d9a4fe9d
Fix typos in YoutubeStreamExtractor.tryDecryptUrl 2022-08-21 21:17:03 +02:00
litetex 9e93d6b193 YoutubeThrottlingDecrypter: Check if returned string is a valid JavaScript function 2022-08-21 20:16:45 +02:00
litetex ed0a07af4e YoutubeThrottlingDecrypter: Patch regex 2022-08-21 20:15:53 +02:00
litetex 8ff7a90f52 Improved consent cookie related constants and documentation 2022-08-21 18:41:40 +02:00
AudricV cb64a480cd
[YouTube] Remove deprecated code in YoutubeThrottlingDecrypter
YoutubeThrottlingDecrypter is now a non-instantiable final class and non-static
attributes have been removed.

The static attributes of this class have been renamed, in order to respect the
naming specification used.
2022-08-16 15:01:05 +02:00
litetex c1a72b8240 Use Android Studios code style 2022-08-14 14:48:29 +02:00
litetex fc27b8a5b8 Use preexisting ``ContentNotAvailableException`` 2022-08-14 14:48:29 +02:00
litetex ecfc370685 Fixed all YTMixPlaylists
Added option to choose if you want to consent or not - currently this is done by a static variable in ``YoutubeParsingHelper`` - may not be the best long-term solution but for now the tests work again (in EU countries) 🥳
2022-08-14 14:48:27 +02:00
litetex 2b6fe294b2 Fixed ``YoutubeMusicSearchExtractorTest``
The renderers and queries got changed.
2022-08-14 14:48:27 +02:00
AudricV 7bdca33a87
[YouTube] Ensure that an additional player response is the correct one
If YouTube detect that requests come from a third party client, they may
replace the real player response by another one of a video saying that this
content is not available on this app and to watch it on the latest version of
YouTube. We can detect this by checking whether the video ID of the player
response returned is the same as the one requested by the extractor.
2022-08-12 19:20:31 +02:00
AudricV c82317e318
[YouTube] Spoof more mobile clients
Additional parameters have been added to the player requests of ANDROID and IOS
clients:

- for both clients: osName and osVersion: their respective values are:
  - for the ANDROID one: Android and 12;
  - for the IOS one: iOS and 15.6.0.19G71.
- for the ANDROID client: androidTargetSdkVersion, with the Android SDK version
  corresponding to the Android version used in the player requests of this
  client. This parameter is now required with this client to be sure to get a
  correct player response, otherwise, the one of a video saying that this
  content is not available in this app and to watch it with the latest version
  of YouTube can be returned instead;
- for the IOS client: deviceMake, with Apple as its value.

The iOS version sent in the IOS client player requests has been also updated to
the version 15.6 of the OS.

Finally, a comment about the requirement to use the signature timestamp from
the player JavaScript base file for HTML5 player requests on videos with
obfuscated URLs has been added and replaces a previous one which may be not
true.
2022-08-12 19:20:31 +02:00
AudricV d0549a5a52
[YouTube] Update client versions and use a real version for the iOS client
The iOS version can be got easily in fact, by looking at the What's New section of the App Store' app page.
2022-08-12 19:20:31 +02:00
AudricV d7e678aca2
[YouTube] Improve WEB client version and API key HTML extraction
Common code in WEB client version HTML extraction has been deduplicated, usage of the Java 8 Stream API has been made and initial data fallback has been used as a last resort.
This means that the client version extraction from regexes will be used before this fallback, as it doesn't contain the full client version.
This can be used as a way to fingerprint the extractor, even if it seems to be not the case.
2022-08-12 19:20:30 +02:00
AudricV 5b548340e8
[YouTube] Catch any exception in YoutubeThrottlingDecrypter.apply and improve docs
This will prevent any future extractor break due to decryption failure, like it was excepted to be the case before.

Some documentation about the throttling decryption has been also improved.
2022-08-12 17:49:36 +02:00
ThetaDev 52ded6e3d7
Handle curly braces inside strings in StringUtils.matchToClosingParenthesis
This is required to extract fully more complex YouTube nsig functions.
2022-08-12 16:32:00 +02:00
Stypox d12003651b
Merge pull request #886 from Isira-Seneviratne/toArray_improvements
Make improvements to methods using toArray().
2022-08-06 22:34:23 +02:00
Isira Seneviratne 7daca10a06 Make improvements to methods using toArray(). 2022-08-06 05:21:12 +05:30
Isira Seneviratne 64771c5712 Use String.join() and Collectors.joining(). 2022-08-04 05:18:13 +05:30
Stypox fc8b5ebbc6
Merge pull request #878 from Isira-Seneviratne/Use_Collections
Use Collections methods.
2022-08-03 22:50:41 +02:00
Isira Seneviratne 1af6b8eedb Use Collections.singletonList(). 2022-07-27 07:35:57 +05:30
Isira Seneviratne ff60e05c76 Use Collections.singletonMap(). 2022-07-27 07:35:57 +05:30
Isira Seneviratne 682a4263e5 Use Objects.requireNonNull(). 2022-07-27 06:55:26 +05:30
Stypox 5ab74b3631
Merge pull request #857 from FireMasterK/video-title
Get original untranslated title for YouTube
2022-07-06 10:26:45 +02:00
AudricV 090debd83b
[YouTube] Fetch the ANDROID client for ended/post livestreams
The ANDROID client was only fetched for video contents, where it can be useful on ended/post livestreams, if the n parameter of the WEB client cannot be decrypted, to avoid throttling issues (because the WEB client was only used before for ended/post livestreams).

It also provides an exclusive 48kbps M4A audio format in the adaptiveFormats array of the JSON player response, like other mobile clients (which can be also extracted from the response of the DASH manifest URL returned into the WEB client player's response, but the DASH manifest is not used by the extractor).

A note about non-fatality of fetching or parsing issues of the ANDROID and IOS clients has been added.
2022-06-21 18:53:49 +02:00
litetex f775155d25
Merge pull request #846 from litetex/remove-unused-methods
Remove unused methods
2022-06-19 15:12:15 +02:00
AudricV 301a795ed3
[SoundCloud] Remove completely workaround for HLS streams
SoundCloud is currently removing this workaround completely, so there is no need to keep it, because it impacts the loading time (a HLS playlist was downloaded and parsed).
2022-06-16 12:12:54 +02:00
AudricV e960a417ec
[YouTube] Fix extraction of fps, audioSampleRate and audioChannels fields for ItagItems of live streams and post live streams
These values were only set before for video streams.

A fallback for the audio channels count has been added, in order to prevent exceptions when generating DASH manifests of audio streams: the fallback value is 2, because most audio streams on YouTube have 2 audio channels.
2022-06-16 12:12:54 +02:00
Kavin 7635aeed2c
Get original untranslated title for YouTube. 2022-06-02 09:57:52 +01:00
TiA4f8R 287d1dfd63
[SoundCloud] Use the HLS delivery method for all streams and extract only a single stream URL from HLS manifest for MP3 streams
SoundCloud broke the workaround used to get a single file from HLS manifests for Opus manifests, but it still works for MP3 ones.

The code has been adapted to prevent an unneeded request (the one to the Opus HLS manifest) and the HLS delivery method is now used for SoundCloud MP3 and Opus streams, plus the progressive one (for tracks which have a progressive stream (MP3) and for the ones which doesn't have one, it is still used by trying to get a progressive stream, using the workaround).

Streams extraction has been also moved to Java 8 Stream's API and the relevant test has been also updated.
2022-05-29 19:08:18 +02:00
Stypox b3c620f0d8
Apply code review and Streams rework 2022-05-28 12:00:58 +02:00
Stypox d652e05874
[MediaCCC] Fix comments about containsSimilarStream 2022-05-28 12:00:58 +02:00
Stypox 044639c32b
Solve some review comments 2022-05-28 12:00:57 +02:00
litetex c33d392958
Fixed typo XEE → XXE (Xml eXternal Entity attack)
See also:
https://en.wikipedia.org/wiki/XML_external_entity_attack
https://owasp.org/www-community/vulnerabilities/XML_External_Entity_(XXE)_Processing
2022-05-28 12:00:57 +02:00
TiA4f8R fffbbee7f3
[YouTube] Add missing Nonnull annotations in getCache method of YouTube DASH manifest creators 2022-05-28 12:00:56 +02:00
TiA4f8R f7b1515290
[YouTube] Refactor DASH manifests creation
Move DASH manifests creation into a new subpackage of the YouTube package, dashmanifestcreators.
This subpackage contains:

- CreationException, exception extending Java's RuntimeException, thrown by manifest creators when something goes wrong;
- YoutubeDashManifestCreatorsUtils, class which contains all common methods and constants of all or a part of the manifest creators;
- a manifest creator has been added per delivery type of YouTube streams:
  - YoutubeProgressiveDashManifestCreator, for progressive streams;
  - YoutubeOtfDashManifestCreator, for OTF streams;
  - YoutubePostLiveStreamDvrDashManifestCreator, for post-live DVR streams (which use the live delivery method).

Every DASH manifest creator has a getCache() static method, which returns the ManifestCreatorCache instance used to cache results.

DeliveryType has been also extracted from the YouTube DASH manifest creators part of the extractor and moved to the YouTube package.

YoutubeDashManifestCreatorTest has been updated and renamed to YoutubeDashManifestCreatorsTest, and YoutubeDashManifestCreator has been removed.

Finally, several documentation and exception messages fixes and improvements have been made.
2022-05-28 12:00:56 +02:00
TiA4f8R f17f7b9842
Apply requested changes in YoutubeParsingHelper 2022-05-28 12:00:55 +02:00
TiA4f8R 301b9fa024
Remove hashCode and equals methods overrides of Stream classes 2022-05-28 12:00:55 +02:00
TiA4f8R 2f3920c648
[YouTube] Return approxDurationMs value from YouTube's player response in ItagItems generated
This change allows to build DASH manifests using YoutubeDashManifestCreator with the real duration of streams and prevent potential cuts of the end of progressive streams, because the duration in YouTube's player response is in seconds and not milliseconds.
2022-05-28 12:00:54 +02:00
TiA4f8R 4158fc46a0
[Bandcamp] Fix regression of Opus radio streams extraction
When moving opus-lo into a constant, opus-lo was renamed to opus_lo and was only used if no MP3 stream was available (which was not the case before the changes in BandcampRadioStreamExtractor related to the addition of the support of all delivery methods), so these changes removed the ability to get Opus streams of Bandcamp radios.

This commit reverts this unwanted change.
2022-05-28 12:00:54 +02:00
TiA4f8R 54d323c2ae
Fix Checkstyle issue in YoutubeDashManifestCreator 2022-05-28 12:00:53 +02:00
Stypox 2321822844
Rename Stream's baseUrl to manifestUrl 2022-05-28 12:00:53 +02:00
Stypox cfc13f4a6f
[YouTube] Reduce exception generation code and move several attributes of MPD documents into constants
Rename YoutubeDashManifestCreationException to CreationException.
Also use these constants in YoutubeDashManifestCreatorTest.
2022-05-28 12:00:53 +02:00
Stypox 00bbe5eb4d
[YouTube] Make exception messages smaller 2022-05-28 12:00:52 +02:00
Stypox 4da05afe11
[YouTube] Inline collectSegmentsData in YoutubeDashManifestCreator 2022-05-28 12:00:52 +02:00
Stypox 3708ab9ed5
[YouTube] Refactor YoutubeDashManifestCreator
- Remove all of the methods used to access caches and replace them with three caches getters
- Rename caches to shorter and more meaningful names
- Remove redundant @throws tags that just say "if this method fails to do what it should do", which is obvious
2022-05-28 12:00:51 +02:00
Stypox 5c83409039
[YouTube] Rewrite manifest test and rename long methods 2022-05-28 12:00:51 +02:00
Stypox 8226fd044f
[YouTube] Suppress Sonar security warning for XEE 2022-05-28 12:00:50 +02:00
Stypox ba68b8c014
[YouTube] Secure DashManifestCreator against XEE attack
Also remove duplicate lines from javadoc comments, otherwise checkstyle complaints.
XEE prevention is inspired from https://github.com/ChuckerTeam/chucker/pull/201/files
For an explanation of XEE see https://rules.sonarsource.com/java/RSPEC-2755, though the solution reported there causes crashes on Android
2022-05-28 12:00:50 +02:00
Stypox 159d05c91b
Remove unused DashMpdParser (but kept in git history) 2022-05-28 12:00:49 +02:00
Stypox 50272db946
Apply reviews: improve comments, remove FILE, remove Stream#equals(Stream) 2022-05-28 12:00:49 +02:00
TiA4f8R 07b045f20d
[YouTube] Support the iOS client in YoutubeDashManifestCreator and decrypt again the n parameter, if present, for all clients
This commits reverts a new behavior introduced in this branch, which only applied the decryption if needed on streams from the WEB client.
Also fix rebase issues and documentations style in YoutubeDashManifestCreator.
2022-05-28 12:00:48 +02:00
TiA4f8R d64d7bbd01
Move ManifestCreatorCache tests to a separate class and remove override of equals and hashCode methods in ManifestCreatorCache
These methods don't need to be overriden, as they are not excepted to be used in collections.
Also improve the toString method of this class, which contains also now clearFactor and maximumSize attributes and for each operations.
2022-05-28 12:00:48 +02:00
TiA4f8R 2fb1a412a6
Fix Checkstyle issues, revert resolution string changes for YouTube video streams and don't return the rn parameter in DASH manifests
This parameter is still used to get the initialization sequence of OTF and POST-live streams, but is not returned anymore in the manifests.
It has been removed in order to avoid fingerprinting based on the number sent (e.g. when starting to play a stream close to the end and using 123 as the request number where it should be 1) and should be added dynamically by clients in their requests.
The relevant test has been also updated.

Checkstyle issues in YoutubeDashManifestCreator have been fixed, and the changes in the resolution string returned for video streams in YoutubeStreamExtractor have been reverted, as they create issues on NewPipe right now.
2022-05-28 12:00:47 +02:00
TiA4f8R f61e2092a1
[YouTube] Return a copy of the hardcoded ItagItem instead of returning the reference to the hardcoded one in ItagItem.getItag
To do so, a copy constructor has been added in the class.

This fixes, for instance, an issue in NewPipe, in which the ItagItem values where not the ones corresponsing to a stream but to another, when generating DASH manifests.
2022-05-28 12:00:47 +02:00
TiA4f8R aa4c10e751
Improve documentation and adress most of the requested changes
Also fix some issues in several places, in the code and the documentation.
2022-05-28 12:00:46 +02:00
TiA4f8R 7477ed0f3d
[YouTube] Add ability to generate manifests of progressive, OTF and post live streams
A new class has been added to do so: YoutubeDashManifestCreator.
It relies on a new class: ManifestCreatorCache, to cache the content, which relies on a new pair class named Pair.
Results are cached and there is a cache per delivery type, on which cache limit, clear factor, clearing and resetting can be applied to each cache and to all caches.
Look at code changes for more details.
2022-05-28 12:00:45 +02:00
TiA4f8R a857684442
Apply changes in YoutubeStreamExtractor
Extract post live DVR streams as post live streams instead of live streams.

A new class has been in order to improve code: ItagInfo, which stores an itag, the content (URL) extracted and if its an URL or not.
A functional interface has been added in order to abstract the stream building: StreamBuilderHelper.
Also add the cver parameter added by the desktop web client on the corresponding streams (a new method has been added in YoutubeParsingHelper to check this and another for Android streams).

Some code in these classes has been also refactored/improved/optimized.
2022-05-28 12:00:44 +02:00
TiA4f8R 4330b5f7be
Add POST_LIVE_STREAM and POST_LIVE_AUDIO_STREAM stream types
This allows the extractor to determine if a content is an ended audio or video livestream.
2022-05-28 12:00:43 +02:00
TiA4f8R 881969f1da
Apply changes in all StreamExtractors except YouTube's one and fix extraction of PeerTube audio streams as video streams
Some code in these classes has been also refactored/improved/optimized.
Also fix the extraction of PeerTube audio streams as video streams, which are now returned as audio streams.
2022-05-28 12:00:43 +02:00
TiA4f8R d5f3637fc3
[YouTube] Return more values returned inside the ItagItems of the player response and deprecate use of public audio and video fields
These fields can be now replaced by a getter and a setter.

New fields have been added and will allow the creation of DASH manifests for OTF and ended livestreams. There are:
- contentLength;
- approxDurationMs;
- targetDurationSec;
- sampleRate;
- audioChannels.
2022-05-28 12:00:42 +02:00
TiA4f8R 7c67d46e09
Move DashMpdParser to the YouTube package and fix extraction of streams
DashMpdParser is only working with YouTube streams, as it uses the ItagItem class.
Also update creation of AudioStreams and VideoStreams objects.
2022-05-28 12:00:41 +02:00
TiA4f8R ad993b920f
Remove fetching of the DASH manifest extracted when getting information of a content with StreamInfo
DashMpdParser is only working with YouTube streams, as it uses the ItagItem class.

Also improve code and comments of StreamInfo (especially final use where possible).
2022-05-28 12:00:41 +02:00
TiA4f8R 2f061b8dbd
Add support of other delivery methods than progressive HTTP in Stream classes
Stream constructors are now private and streams can be constructed with new Builder classes per stream class. This change has been made to prevent creating and using several constructors in stream classes.

Some default cases have been also added in these Builder classes, so not everything has to be set, depending of the service and the content.
2022-05-28 12:00:27 +02:00
TiA4f8R c34b5e3a8b
[YouTube] Fix extraction of YouTube Music client version and API key when using YouTube Music's website in EU
Google returns now the consent page of YouTube for YouTube Music in EU, which can be also avoided by adding the ucbcb parameter to the URL with the value 1 ("?ucbcb=1").
2022-05-15 11:20:06 +02:00
litetex 2015eb374a Removed more unused methods 2022-05-09 21:05:03 +02:00
litetex f69b0ff77b Remove unused methods 2022-05-09 20:59:25 +02:00
TiA4f8R 3c3cd78676
Remove Checkstyle suppressions file and fix Checkstyle issues introduced in 24e8399 and 8c1041d
The Checkstyle suppressions file is now replaced by // CHECKSTYLE:OFF and // CHECKSTYLE:ON comments.
2022-05-02 21:51:25 +02:00
Stypox 2e1c5c119d
Merge pull request #822 from Stypox/more-refactors
More refactors
2022-05-02 19:03:54 +02:00
Stypox 598ebb92ea
Merge pull request #839 from TeamNewPipe/bandcamp/extract-length
Bandcamp: extract stream length
2022-05-02 15:49:41 +02:00
litetex 5db4d1faf3
Merge pull request #782 from litetex/cleanup-yt-stream-extractor
Cleanup of ``YoutubeStreamExtractor`` and some related classes
2022-05-01 16:44:11 +02:00
litetex fe30eb43a9 Cleanup ``YoutubeStreamExtractor`` and some related classes
* Fixed obvious sonar(lint) warnings
* Abstracted some code (get*Streams)
* Used some new lines to make code better readable
* Chopped down brace-jungle in some methods
* Use StandardCharset (Java 8 4tw)
2022-05-01 16:39:07 +02:00
Stypox c2b5370517
Apply suggestions: improve switch and use EMPTY_STRING 2022-04-30 16:39:51 +02:00
Stypox 7c78c39230
Merge pull request #821 from litetex/cleanup-TimeAgoParser-java
Cleanup ``TimeAgoParser``
2022-04-30 16:20:09 +02:00
TiA4f8R 9f9af35adb
[YouTube] Fix regression introduced in the order of streams used when adding more parameters to InnerTube requests, using the iOS client for livestreams and more 2022-04-25 20:23:04 +02:00
Fynn Godau c38c016de5 Bandcamp: extract stream length 2022-04-24 21:24:19 +02:00
Stypox 52fa2d939a
Fix javadoc formatting error causing deployment to fail 2022-04-16 17:07:07 +02:00
Stypox dcb7483dcf
Fix YouTube throttling decrypter function parsing 2022-04-15 13:10:19 +02:00