Commit Graph

139 Commits

Author SHA1 Message Date
ThetaDev c156c404cb Merge branch 'dev' of github.com:TeamNewPipe/NewPipeExtractor into channel-tabs 2022-11-29 17:50:32 +01:00
ThetaDev ffd02a4bc8 fix: shorts continuation 2022-11-29 17:50:14 +01:00
Kavin 52fda37915 Implement bold/italic/strike-through support. 2022-11-28 19:06:18 +00:00
ThetaDev f7e3b713b5 Merge branch 'dev' into channel-tabs 2022-11-22 02:38:03 +01:00
ThetaDev 8d3bc2bc4b fix: YoutubeParsingHelper formatting 2022-11-22 01:59:51 +01:00
Tobi 2211a24b69
Merge pull request #971 from lrusso96/patch-1
[YouTube] Improve duration parsing
2022-11-16 16:14:54 +01:00
Isira Seneviratne ddbce3b83d Add Utils methods for URL encoding/decoding using UTF-8. 2022-11-12 07:29:15 +05:30
Isira Seneviratne 366f5c1632 Use StandardCharsets.UTF_8. 2022-11-12 07:29:15 +05:30
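
As a hedged illustration of the two commits above, here is a minimal sketch of UTF-8 URL encoding and decoding helpers built on StandardCharsets.UTF_8. The class and method names (UrlCodecSketch, encodeUrlUtf8, decodeUrlUtf8) are hypothetical and are not taken from the Utils class itself.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; names are illustrative, not the extractor's actual API. */
final class UrlCodecSketch {
    private UrlCodecSketch() { }

    /** Percent-encodes a value for use in a query string, always using UTF-8. */
    static String encodeUrlUtf8(final String value) {
        try {
            // The (String, String) overload exists on every Java version since 1.4
            return URLEncoder.encode(value, StandardCharsets.UTF_8.name());
        } catch (final UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encodeUrlUtf8. */
    static String decodeUrlUtf8(final String value) {
        try {
            return URLDecoder.decode(value, StandardCharsets.UTF_8.name());
        } catch (final UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(final String[] args) {
        final String encoded = encodeUrlUtf8("a b&c=d");
        System.out.println(encoded);                // a+b%26c%3Dd
        System.out.println(decodeUrlUtf8(encoded)); // a b&c=d
    }
}
```

Centralizing the charset in one place keeps the checked UnsupportedEncodingException out of calling code.
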
Luigi Russo c9635218e2 [YouTube] Improve duration parsing 2022-11-09 09:41:29 +01:00
Isira Seneviratne 316d8573fa Use immutable sets in YoutubeParsingHelper. 2022-11-07 07:50:26 +05:30
ThetaDev 73c182f817 Merge branch 'dev' of github.com:TeamNewPipe/NewPipeExtractor into channel-tabs 2022-11-04 23:50:04 +01:00
ThetaDev f71fdac166 refactor: API changes 2022-11-04 23:47:44 +01:00
ThetaDev 592e1d6386 fix: parsing attributed description with no command runs 2022-11-03 12:10:52 +01:00
ThetaDev 099b53cc4f
[YouTube] Add parser for attributedDescription
Also update the mock of the next InnerTube endpoint response of the
YoutubeStreamExtractorDefaultTest.DescriptionTestUnboxing test class with an
attributedDescription instead of a regular description
2022-11-02 23:11:33 +01:00
Kavin 6a256d0631 Add uploader url and verified to PlaylistInfoItem. 2022-10-30 13:00:19 +00:00
ThetaDev 12537733c1 fix: store YouTube visitor data for channel tabs 2022-10-25 09:20:18 +02:00
ThetaDev 57865e2195 feat: add visitor data config option 2022-10-23 21:57:15 +02:00
ThetaDev 8b4b4310ea feat: add tab support to channel extractor
- extract YouTube channel tabs: playlists, channels, shorts, live
2022-10-22 15:29:35 +02:00
Isira Seneviratne 943b7c033b Remove EMPTY_STRING. 2022-08-24 06:59:17 +05:30
litetex 8ff7a90f52 Improved consent cookie related constants and documentation 2022-08-21 18:41:40 +02:00
litetex ecfc370685 Fixed all YTMixPlaylists
Added an option to choose whether or not to consent. Currently this is done via a static variable in ``YoutubeParsingHelper``, which may not be the best long-term solution, but for now the tests work again (in EU countries) 🥳
2022-08-14 14:48:27 +02:00
AudricV c82317e318
[YouTube] Spoof more mobile clients
Additional parameters have been added to the player requests of ANDROID and IOS
clients:

- for both clients: osName and osVersion: their respective values are:
  - for the ANDROID one: Android and 12;
  - for the IOS one: iOS and 15.6.0.19G71.
- for the ANDROID client: androidTargetSdkVersion, with the Android SDK version
  corresponding to the Android version used in the player requests of this
  client. This parameter is now required with this client to get a correct
  player response; otherwise, the response of a video stating that this
  content is not available in this app, and to watch it with the latest
  version of YouTube, can be returned instead;
- for the IOS client: deviceMake, with Apple as its value.

The iOS version sent in the IOS client player requests has also been updated to
version 15.6 of the OS.

Finally, a comment about the requirement to use the signature timestamp from
the player JavaScript base file for HTML5 player requests on videos with
obfuscated URLs has been added; it replaces a previous one which may not have
been accurate.
2022-08-12 19:20:31 +02:00
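
The parameters listed in the commit above can be pictured with the following minimal sketch. The values for osName, osVersion and deviceMake come from the commit message. The value 31 for androidTargetSdkVersion is an assumption (the API level of Android 12), and the class name, method names and flat map layout are hypothetical, since the message does not show how the fields are nested in the request body.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; the real request body layout is not shown in the commit message. */
final class MobileClientParamsSketch {
    private MobileClientParamsSketch() { }

    /** Extra fields described for the ANDROID client. */
    static Map<String, Object> androidClientParams() {
        final Map<String, Object> client = new LinkedHashMap<>();
        client.put("osName", "Android");
        client.put("osVersion", "12");
        // Assumption: Android 12 corresponds to API level 31
        client.put("androidTargetSdkVersion", 31);
        return client;
    }

    /** Extra fields described for the IOS client. */
    static Map<String, Object> iosClientParams() {
        final Map<String, Object> client = new LinkedHashMap<>();
        client.put("deviceMake", "Apple");
        client.put("osName", "iOS");
        client.put("osVersion", "15.6.0.19G71");
        return client;
    }

    public static void main(final String[] args) {
        System.out.println(androidClientParams());
        System.out.println(iosClientParams());
    }
}
```
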
AudricV d0549a5a52
[YouTube] Update client versions and use a real version for the iOS client
The iOS client version can in fact be obtained easily, by looking at the What's New section of the app's App Store page.
2022-08-12 19:20:31 +02:00
AudricV d7e678aca2
[YouTube] Improve WEB client version and API key HTML extraction
Common code in the WEB client version HTML extraction has been deduplicated, the Java 8 Stream API is now used, and the initial data is used as a last-resort fallback.
This means that the client version extraction from regexes is tried before this fallback, as the fallback doesn't contain the full client version.
An incomplete client version could be used as a way to fingerprint the extractor, even if that does not seem to be the case.
2022-08-12 19:20:30 +02:00
Isira Seneviratne 1af6b8eedb Use Collections.singletonList(). 2022-07-27 07:35:57 +05:30
Isira Seneviratne ff60e05c76 Use Collections.singletonMap(). 2022-07-27 07:35:57 +05:30
TiA4f8R f17f7b9842 Apply requested changes in YoutubeParsingHelper 2022-05-28 12:00:55 +02:00
Stypox 50272db946 Apply reviews: improve comments, remove FILE, remove Stream#equals(Stream) 2022-05-28 12:00:49 +02:00
TiA4f8R aa4c10e751
Improve documentation and address most of the requested changes
Also fix some issues in several places, in the code and the documentation.
2022-05-28 12:00:46 +02:00
TiA4f8R a857684442
Apply changes in YoutubeStreamExtractor
Extract post live DVR streams as post live streams instead of live streams.

A new class, ItagInfo, has been added to improve the code: it stores an itag, the extracted content (URL) and whether this content is a URL or not.
A functional interface, StreamBuilderHelper, has been added to abstract the stream building.
Also add the cver parameter, which the desktop web client adds, to the corresponding streams (a new method has been added in YoutubeParsingHelper to check this, and another one for Android streams).

Some code in these classes has also been refactored, improved and optimized.
2022-05-28 12:00:44 +02:00
TiA4f8R c34b5e3a8b
[YouTube] Fix extraction of YouTube Music client version and API key when using YouTube Music's website in the EU
Google now returns the YouTube consent page for YouTube Music in the EU, which can also be avoided by adding the ucbcb parameter to the URL with the value 1 ("?ucbcb=1").
2022-05-15 11:20:06 +02:00
Stypox 2e1c5c119d
Merge pull request #822 from Stypox/more-refactors
More refactors
2022-05-02 19:03:54 +02:00
TiA4f8R 67288a0191
[YouTube] Fix extraction of embeddable age-restricted videos, fix extraction of contents with warnings and more
Use the TV embedded client technique to get streams of embeddable age-restricted videos.

This client doesn't provide the playerMicroFormatRenderer object in the player response, but the object is still returned in the WEB player response, even for unavailable (but non-private) contents, so we now need to store it, as we are replacing the player response from the WEB client with the TV embedded one.
Otherwise, some metadata, such as the unlisted property, the category, and the uploadDate and publishDate properties, would be lost.

The outdated code for these contents has been removed.

Add the racyCheckOk and contentCheckOk parameters to player and next requests to the InnerTube API.
The first doesn't seem to make any difference when used anonymously, but the second one is needed to get streams of contents with a warning before they can be played.

Also apply some requested changes, fixes and improvements in YoutubeParsingHelper and YoutubeStreamExtractor.
2022-04-02 19:06:36 +02:00
TiA4f8R dfa4239661 Fix missing imports and Checkstyle issues 2022-03-27 22:10:57 +02:00
TiA4f8R 2e3da445e6
[YouTube] Add documentation about added parameters, client versions and key
Also move the iPhone device machine id to a constant, explain how it is used and move the licence in the header of the file, and fix missing imports in YoutubeStreamExtractor (due to a rebase issue).
2022-03-27 22:10:57 +02:00
TiA4f8R 1dad3bfe8b
[YouTube] Update hardcoded client versions again and update mobile user agents
Also provide the ability to get the mobile user agents used for mobile InnerTube requests and deduplicate related code.
2022-03-27 20:52:40 +02:00
TiA4f8R 3d38459cf3
[YouTube] Reduce InnerTube response sizes by adding the prettyPrint parameter with the false value
InnerTube returns pretty-printed responses by default, which needlessly increases their size.

By using the prettyPrint parameter on requests and setting its value to false, responses are no longer pretty-printed, which reduces response size, and therefore data transfer and processing times.
YouTube has recently deployed this usage on its websites.
2022-03-27 20:52:40 +02:00
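
A minimal sketch of the idea in the commit above, assuming the parameter is appended to the request URL as a query parameter. The helper name and the example.com URL are placeholders, not the extractor's actual code or a real InnerTube endpoint.

```java
/** Hypothetical helper showing how prettyPrint=false could be appended to a request URL. */
final class PrettyPrintSketch {
    private PrettyPrintSketch() { }

    /**
     * Appends prettyPrint=false, using '?' or '&' depending on whether the URL
     * already has a query string. URLs containing a fragment are not handled.
     */
    static String withPrettyPrintDisabled(final String url) {
        final String separator = url.contains("?") ? "&" : "?";
        return url + separator + "prettyPrint=false";
    }

    public static void main(final String[] args) {
        // Placeholder URL, not a real endpoint
        System.out.println(withPrettyPrintDisabled("https://example.com/api/player?key=PLACEHOLDER"));
    }
}
```
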
litetex 349ba8db7f
Improve tests and randomness
- Use the existing RNG inside YoutubeParsingHelper
- Deduplicated test-setup for YouTube tests
- Minor improvements
2022-03-27 20:52:38 +02:00
TiA4f8R d0d91e6690 Address requested changes 2022-03-27 20:51:39 +02:00
TiA4f8R b6bc521f0d [YouTube] Update client versions again 2022-03-27 20:51:38 +02:00
TiA4f8R 26f93f5bb0
[YouTube] Extract streams of livestreams from the iOS client and disable the Android client for livestreams
By default, the iOS client is only enabled for livestreams and the Android client is now only enabled for videos.

A way to force, or not, the fetch of both clients has been added with two new static methods in YoutubeStreamExtractor.
2022-03-27 20:51:38 +02:00
TiA4f8R 7d07924de8
[YouTube] Try to use lighter requests when extracting client version and key from YouTube and YouTube Music
This is done by fetching https://www.youtube.com/sw.js for YouTube and https://music.youtube.com/sw.js for YouTube Music.

Two new methods have been added to the Utils class: they try to match the regular expressions of a String array, or of a Pattern array, against some content, and return the group at a specific index or at index 0.
Some code in this class has also been refactored.
2022-03-27 20:51:38 +02:00
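
The multi-regex matching described in the commit above could look roughly like the sketch below. It is not the actual Utils code: the class name, the method name firstMatch and the use of Optional for the no-match case are assumptions made for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of trying several regular expressions in order. */
final class MultiRegexSketch {
    private MultiRegexSketch() { }

    /**
     * Returns the requested group of the first pattern that matches the content.
     * If that pattern matched but the group did not participate, the result is empty.
     */
    static Optional<String> firstMatch(final Pattern[] patterns,
                                       final String content,
                                       final int group) {
        for (final Pattern pattern : patterns) {
            final Matcher matcher = pattern.matcher(content);
            if (matcher.find()) {
                return Optional.ofNullable(matcher.group(group));
            }
        }
        return Optional.empty();
    }

    /** Same as above, but compiles the regular expressions first. */
    static Optional<String> firstMatch(final String[] regexes,
                                       final String content,
                                       final int group) {
        final Pattern[] patterns = new Pattern[regexes.length];
        for (int i = 0; i < regexes.length; i++) {
            patterns[i] = Pattern.compile(regexes[i]);
        }
        return firstMatch(patterns, content, group);
    }

    public static void main(final String[] args) {
        // Illustrative regexes and content, not the ones used by the extractor
        final String[] regexes = {"clientVersion=([0-9.]+)", "cver:([0-9.]+)"};
        System.out.println(firstMatch(regexes, "x cver:2.5 y", 1).orElse("no match"));
    }
}
```

Ordering the array from most to least reliable expression lets the caller express fallbacks without nested try/catch blocks.
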
TiA4f8R 05b7fee23b
[YouTube] Add the cpn param to playback requests and try to better spoof the Android client
The cpn param, aka the content playback nonce param, is a parameter sent by the YouTube web client in videoplayback requests and, for some of them, in the player request body. This PR adds it everywhere.

For the desktop/WEB client, some params were missing from the playbackContext object, which may have made YouTube throttle streams extracted from the WEB client. This PR adds them.

Fingerprinting the WEB client based on the client version used is no longer possible, because the latest client version is now extracted on the first YouTube request of a session. This requires the extractor to fetch the website again (which may unfortunately bring back the reCaptcha issues, but there seems to be no other way to get it).

For the Android client, the video id is now also sent as a query parameter, as is a 12-character string in the t query parameter, in order to better spoof this client. Further research is needed on this parameter, which is unique to each request, and on how clients generate it.

This commit also fixes a small bug with the Android User-Agent string.

Some code improvements have also been made.
2022-03-27 20:51:38 +02:00
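
A hedged sketch of how the random values mentioned in the commit above could be generated. The 12-character length of the t parameter comes from the commit message. The 16-character length of the cpn and the URL-safe alphabet are assumptions, and the class and method names are hypothetical.

```java
import java.util.Random;

/** Hypothetical sketch of random request parameters; lengths and alphabet are partly assumed. */
final class NonceSketch {
    // Assumption: a URL-safe alphabet of 64 characters
    static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
    private static final Random RANDOM = new Random();

    private NonceSketch() { }

    /** Builds a random string of the given length from ALPHABET. */
    static String randomString(final int length) {
        final StringBuilder builder = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            builder.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return builder.toString();
    }

    /** Content playback nonce; the length of 16 is an assumption. */
    static String generateContentPlaybackNonce() {
        return randomString(16);
    }

    /** Value for the t query parameter; 12 characters as stated in the commit message. */
    static String generateTParameter() {
        return randomString(12);
    }

    public static void main(final String[] args) {
        System.out.println("cpn=" + generateContentPlaybackNonce());
        System.out.println("t=" + generateTParameter());
    }
}
```
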
TiA4f8R 83f374bff1
[YouTube] Update client versions and fix a bug when using resetClientVersionAndKey method
The boolean keyAndVersionExtracted in YoutubeParsingHelper was not set to false when resetting the client version and the key, which made the extractor use null the next time the client version or the key was requested, if they had been extracted before.
Also update client versions.
2022-03-27 20:51:38 +02:00
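
The bug described in the commit above can be illustrated with the following minimal sketch. It uses an instance-based cache with placeholder values for readability. Only the field name keyAndVersionExtracted comes from the commit message; the class, the other names and the fake extraction are hypothetical.

```java
/** Hypothetical sketch of a lazily extracted value pair guarded by a boolean flag. */
final class ClientVersionCacheSketch {
    private String clientVersion;
    private String key;
    private boolean keyAndVersionExtracted;
    private int extractionCount;

    String getClientVersion() {
        extractIfNeeded();
        return clientVersion;
    }

    String getKey() {
        extractIfNeeded();
        return key;
    }

    /** Clears the cached values so that the next getter call extracts them again. */
    void reset() {
        clientVersion = null;
        key = null;
        // Without this line the getters would skip extraction and return null
        keyAndVersionExtracted = false;
    }

    private void extractIfNeeded() {
        if (keyAndVersionExtracted) {
            return;
        }
        // Stand-in for the real extraction, which would perform network requests
        extractionCount++;
        clientVersion = "placeholder-version-" + extractionCount;
        key = "placeholder-key-" + extractionCount;
        keyAndVersionExtracted = true;
    }

    public static void main(final String[] args) {
        final ClientVersionCacheSketch cache = new ClientVersionCacheSketch();
        System.out.println(cache.getClientVersion());
        cache.reset();
        System.out.println(cache.getClientVersion());
    }
}
```
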
Stypox adbbdc7a5b [YouTube] Fix regex warning: use ' {2}' instead of ' ' 2022-03-26 22:07:14 +01:00
litetex 7598b40957 Workaround for incorrect duration for "YT shorts" videos in channels
As a workaround 0 is returned as duration for such videos.
See also https://github.com/TeamNewPipe/NewPipe/issues/8034
2022-03-26 20:52:24 +01:00
Stypox 740a37a2de [YouTube] Fix checkstyle issues 2022-03-26 19:42:40 +01:00
XiangRongLin aa6b7272a4
Merge pull request #804 from Stypox/fix-yt-music-mix
[YouTube] Fix music mixes in some countries
2022-03-20 08:35:56 +01:00
Stypox 401082abe4 [YouTube] Extract playlist type in playlist extractor 2022-03-19 10:48:12 +01:00
Stypox 63ed06a710
[YouTube] Differentiate genre mixes from normal mixes
Note: genre mixes already worked; they are now just treated as such during video id extraction and in related items
Note 2: extracting a video id from a *normal* YouTube mix id will now fail if the video id is not exactly 11 characters long
2022-03-19 10:46:31 +01:00
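
The two notes in the commit above can be pictured with the sketch below. The 11-character video id length comes from the commit message. The "RD" prefix for normal mixes and the "RDGMEM" prefix for genre mixes are assumptions, and the class and method names are hypothetical.

```java
/** Hypothetical sketch of video id extraction from mix ids; the prefixes are assumptions. */
final class MixIdSketch {
    private static final int VIDEO_ID_LENGTH = 11;
    private static final String MIX_PREFIX = "RD";           // assumed prefix of every mix id
    private static final String GENRE_MIX_PREFIX = "RDGMEM"; // assumed prefix of genre mix ids

    private MixIdSketch() { }

    static boolean isMixId(final String playlistId) {
        return playlistId.startsWith(MIX_PREFIX);
    }

    static boolean isGenreMixId(final String playlistId) {
        return playlistId.startsWith(GENRE_MIX_PREFIX);
    }

    /**
     * Returns the video id embedded in a normal mix id.
     * Genre mixes are rejected, as are ids whose remainder is not 11 characters long.
     */
    static String extractVideoIdFromMixId(final String playlistId) {
        if (isGenreMixId(playlistId)) {
            throw new IllegalArgumentException("Genre mix ids do not embed a video id");
        }
        if (!isMixId(playlistId)) {
            throw new IllegalArgumentException("Not a mix id: " + playlistId);
        }
        final String videoId = playlistId.substring(MIX_PREFIX.length());
        if (videoId.length() != VIDEO_ID_LENGTH) {
            throw new IllegalArgumentException("Video id is not 11 characters long");
        }
        return videoId;
    }

    public static void main(final String[] args) {
        // Made-up id used only for illustration
        System.out.println(extractVideoIdFromMixId("RDabcdefghijk"));
    }
}
```

In this simplified form, a normal mix whose video id happens to start with "GMEM" would be misclassified as a genre mix; a stricter check would need more knowledge of the id formats than the commit message provides.
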