Commit Graph

113 Commits

Author SHA1 Message Date
TiA4f8R f17f7b9842
Apply requested changes in YoutubeParsingHelper 2022-05-28 12:00:55 +02:00
Stypox 50272db946
Apply reviews: improve comments, remove FILE, remove Stream#equals(Stream) 2022-05-28 12:00:49 +02:00
TiA4f8R aa4c10e751
Improve documentation and adress most of the requested changes
Also fix some issues in several places, in the code and the documentation.
2022-05-28 12:00:46 +02:00
TiA4f8R a857684442
Apply changes in YoutubeStreamExtractor
Extract post live DVR streams as post live streams instead of live streams.

A new class has been in order to improve code: ItagInfo, which stores an itag, the content (URL) extracted and if its an URL or not.
A functional interface has been added in order to abstract the stream building: StreamBuilderHelper.
Also add the cver parameter added by the desktop web client on the corresponding streams (a new method has been added in YoutubeParsingHelper to check this and another for Android streams).

Some code in these classes has been also refactored/improved/optimized.
2022-05-28 12:00:44 +02:00
TiA4f8R c34b5e3a8b
[YouTube] Fix extraction of YouTube Music client version and API key when using YouTube Music's website in EU
Google returns now the consent page of YouTube for YouTube Music in EU, which can be also avoided by adding the ucbcb parameter to the URL with the value 1 ("?ucbcb=1").
2022-05-15 11:20:06 +02:00
Stypox 2e1c5c119d
Merge pull request #822 from Stypox/more-refactors
More refactors
2022-05-02 19:03:54 +02:00
TiA4f8R 67288a0191
[YouTube] Fix extraction of embeddable age-restricted videos, fix extraction of contents with warnings and more
Use the TV embedded client technique to get streams of embeddable age-restricted videos.

This client doesn't provide the playerMicroFormatRenderer object in the player response, but it is still returned on the WEB player response, even for unavailable (but non-private) contents, so we need now to store it, as we are replacing the player response from the WEB client by the TV embedded one.
Otherwise, some metadata such as the unlisted property, category, the uploadDate and the publishDate properties.

The outdated code for these contents has been removed.

Add the racyCheckOk and contentCheckOk to player and next requests to the InnerTube API.
The first doesn't seem to make any difference when used anonymously, but the second one is needed to get streams of contents with a warning before they can be played.

Also apply some requested changes, fixes and improvements in YoutubeParsingHelper and YoutubeStreamExtractor.
2022-04-02 19:06:36 +02:00
TiA4f8R dfa4239661
Fix missing imports and Checkstyle issues 2022-03-27 22:10:57 +02:00
TiA4f8R 2e3da445e6
[YouTube] Add documentation about parameters added and clients versions and key
Also move the iPhone device machine id to a constant, explain how it is used and move the licence in the header of the file, and fix missing imports in YoutubeStreamExtractor (due to a rebase issue).
2022-03-27 22:10:57 +02:00
TiA4f8R 1dad3bfe8b
[YouTube] Update again hardcoded client versions and update mobile user agents
Also provide ability to get mobile user-agents used for mobile InnerTube requests and deduplicate related code.
2022-03-27 20:52:40 +02:00
TiA4f8R 3d38459cf3
[YouTube] Reduce InnerTube response sizes by adding the prettyPrint parameter with the false value
InnerTube responses return pretty printed responses, which increase responses' size for nothing.

By using the prettyPrint parameter on requests and setting its value to false, responses are not pretty printed anymore, which reduces responses size, and so data transfer and processing times.
This usage has been recently deployed by YouTube on their websites.
2022-03-27 20:52:40 +02:00
litetex 349ba8db7f
Improve tests and randomness
- Use the existing RNG inside YoutubeParsingHelper
- Deduplicated test-setup for YouTube tests
- Minor improvements
2022-03-27 20:52:38 +02:00
TiA4f8R d0d91e6690
Adress requested changes 2022-03-27 20:51:39 +02:00
TiA4f8R b6bc521f0d
[YouTube] Update client versions again 2022-03-27 20:51:38 +02:00
TiA4f8R 26f93f5bb0
[YouTube] Extract streams of livestreams from the iOS client and disabled the Android client for livestreams
The iOS client is only enabled for livestreams and the Android client is now only enabled for videos, both by default.

A way to force, or not, the fetch of both clients have been added with two new static methods in YoutubeStreamExtractor.
2022-03-27 20:51:38 +02:00
TiA4f8R 7d07924de8
[YouTube] Try to use lighter requests when extracting client version and key from YouTube and YouTube Music
This is done by fetching https://www.youtube.com/sw.js for YouTube and https://music.youtube.com/sw.js for YouTube Music.

Two new methods in Utils class have been added which allow to try to get a match of regular expressions in a string array, or a Pattern array, on a content, on a specific index or 0.
Also some code refactoring has been made in this class.
2022-03-27 20:51:38 +02:00
TiA4f8R 05b7fee23b
[YouTube] Add the cpn param to playback requests and try to spoof better the Android client
The cpn param, aka the content playback nonce param, is a parameter sent by YouTube web client in videoplayback requests, and for some of them, in the player request body. This PR adds it everywhere.

For the desktop/WEB client, some params were missing from the playbackContext object, which seemed (or not) to make YouTube throttle streams extracted from the WEB client. This PR adds them.

Fingerprinting on the WEB client basing on the client version used is not possible anymore, because the latest client version is extracted at the first time of a YouTube request on a session which require the extractor to fetch again the website (and this may come back the reCaptcha issues again unfortunately, but it seems there is no other way to get it).

For the Android client, the video id is now also sent as a query parameter, like a 12 characters string, in the t query parameter, in order to spoof better this client. Researches need to be done on this parameter, unique to each request, and how it is generated by clients.

This commit also fixes a small bug with the Android User-Agent string.

Some code improvements have been also made.
2022-03-27 20:51:38 +02:00
TiA4f8R 83f374bff1
[YouTube] Update client versions and fix a bug when using resetClientVersionAndKey method
The boolean keyAndVersionExtracted in YoutubeParsingHelper was not set to false when resetting the client version and the key, which makes the extractor uses null on the next getting of the client version or the key if the clientVersion and the key were extracted before.
Also update client versions.
2022-03-27 20:51:38 +02:00
Stypox adbbdc7a5b
[YouTube] Fix regex warning: use ' {2}' instead of ' ' 2022-03-26 22:07:14 +01:00
litetex 7598b40957 Workaround for incorrect duration for "YT shorts" videos in channels
As a workaround 0 is returned as duration for such videos.
See also https://github.com/TeamNewPipe/NewPipe/issues/8034
2022-03-26 20:52:24 +01:00
Stypox 740a37a2de [YouTube] Fix checkstyle issues 2022-03-26 19:42:40 +01:00
XiangRongLin aa6b7272a4
Merge pull request #804 from Stypox/fix-yt-music-mix
[YouTube] Fix music mixes in some countries
2022-03-20 08:35:56 +01:00
Stypox 401082abe4
[YouTube] Extract playlist type in playlist extractor 2022-03-19 10:48:12 +01:00
Stypox 63ed06a710
[YouTube] Differentiate genre mixes from normal mixes
Note: genre mixes already worked, now they are just considered as such in various video id extraction and in related items
Note 2: now extracting a mix id from a *normal* youtube mix id will fail if the video id wouldn't be exactly 11 characters long
2022-03-19 10:46:31 +01:00
Stypox 8f9d5b858e
[YouTube] Remove useless comments about mixes 2022-03-19 10:44:06 +01:00
Stypox 50db871d89
[YouTube] Extract mixes from streams related items 2022-03-19 10:44:06 +01:00
Stypox dd8687f9fe
[YouTube] Fix music mixes in some countries 2022-03-01 23:02:56 +01:00
mhmdanas 3e8e2a1532 Add support for y2u.be links 2021-10-22 22:48:18 +03:00
magicfelix 0e16091ce0
Add invidious instance EduVid Tubus 2021-08-12 10:06:41 +02:00
FireMasterK 2eeb0a3403
Rebase + some code improvements + fix extraction of age-restricted videos + update clients version
Here is now the requests which will be made by the `onFetchPage` method of `YoutubeStreamExtractor`:

- the desktop API is fetched.

If there is no streaming data, the desktop player API with the embed client screen will be fetched (and also the player code), then the Android mobile API.
- if there is no streaming data, a `ContentNotAvailableException` will be thrown by using the message provided in playability status

If the video is age restricted, a request to the next endpoint of the desktop player with the embed client screen will be sent.
Otherwise, the next endpoint will be fetched normally, if the content is available.

If the video is not age-restricted, a request to the player endpoint of the Android mobile API will be made.

We can get more streams by using the Android mobile API but some streams may be not available on this API, so the streaming data of the Android mobile API will be first used to get itags and then the streaming data of the desktop internal API will be used.
If the parsing of the Android mobile API went wrong, only the streams of the desktop API will be used.

Other code changes:

- `prepareJsonBuilder` in `YoutubeParsingHelper` was renamed to `prepareDesktopJsonBuilder`
- `prepareMobileJsonBuilder` in `YoutubeParsingHelper` was renamed to `prepareAndroidMobileJsonBuilder`
- two new methods in `YoutubeParsingHelper` were added: `prepareDesktopEmbedVideoJsonBuilder` and `prepareAndroidMobileEmbedVideoJsonBuilder`
- `createPlayerBodyWithSts` is now public and was moved to `YoutubeParsingHelper`
- a new method in `YoutubeJavaScriptExtractor` was added: `resetJavaScriptCode`, which was needed for the method `resetDebofuscationCode` of `YoutubeStreamExtractor`
- `areHardcodedClientVersionAndKeyValid` in `YoutubeParsingHelper` returns now a `boolean` instead of an `Optional<Boolean>`
- the `fetchVideoInfoPage` method of `YoutubeStreamExtractor` was removed because YouTube returns now 404 for every client with the `get_video_info` page
- some unused objects and some warnings in `YoutubeStreamExtractor` were removed and fixed

Co-authored-by: TiA4f8R <74829229+TiA4f8R@users.noreply.github.com>
2021-08-01 12:39:03 +02:00
TiA4f8R 7753556e66
Adress the last requested changes + update YoutubeCommentsExtractor mocks 2021-08-01 12:39:03 +02:00
TiA4f8R 8aa60d7e8f
Update clients version 2021-08-01 12:39:01 +02:00
TiA4f8R 609919db59
Adress again reviews, fix some rebase issues 2021-08-01 12:39:00 +02:00
TiA4f8R 4299d806a2
Adress changes 2021-08-01 12:38:59 +02:00
TiA4f8R 1a6b8da438
Annotate YoutubeParsingHelper methods with Nonnull when needed 2021-08-01 12:38:59 +02:00
TiA4f8R a6a2c6eb80
Revert the use of Collections.singletonList instead of Arrays.asList in addCookieHeader of YoutubeParsingHelper 2021-08-01 12:38:59 +02:00
TiA4f8R 632772d17f
Adress requested changes in YoutubeParsingHelper 2021-08-01 12:38:58 +02:00
TiA4f8R 657f165771
Update client version and mocks 2021-08-01 12:38:44 +02:00
TiA4f8R 34a9ccb0fd
Adress requested changes 2021-08-01 12:38:42 +02:00
TiA4f8R 54d4551ca6
Adress requested changes in YoutubeParsingHelper and update mobile client version 2021-08-01 12:38:42 +02:00
TiA4f8R 70927ddade
Update client version and mocks 2021-08-01 12:38:40 +02:00
TiA4f8R 0f9e9b8b4b
Use the youtubei API for YouTube mixes + update the corresponding test + do some improvements
Use the youtubei API for YouTube mixes. The corresponding has been updated because the new API breaks the tests of YoutubeMixPlaylistExtractorTest.
Remove some deprecated code (the old search code with the pbj JSON) and do some other improvements.
2021-08-01 12:38:37 +02:00
TiA4f8R 013b902535
Use the Android mobile API when there are OTF streams or the content is protected by signatureCiphers
Use the Android mobile API to get the itag 22 (720p with audio), removed when the content is protected by signatureCiphers.
Also use this API when they are OTF streams, to get the itag 17 and 36, low 3GPP quality streams but also the itag 139.
Update the web client version.
2021-08-01 12:38:36 +02:00
TiA4f8R e7d589edbf
Use the youtubei API for YouTube videos + update client version
Update the hardcoded client version to 2.20210520.09.00
Use the player and next endpoints of the Innertube API for YouTube videos
2021-08-01 12:38:36 +02:00
TiA4f8R f73c923f60
Don't use the youtubei.googleapis.com but the websites domains + update client version of the desktop internal API
Use again www.youtube.com and music.youtube.com domains instead of youtubei.googleapis.com domain because it spoofs more a web client of YouTube or YouTube Music and may reduce Google's detection of NewPipe Extractor users.
2021-08-01 12:38:34 +02:00
TiA4f8R 4d682834c3
Fix localization and update client version 2021-08-01 12:38:03 +02:00
TiA4f8R f46cfb0f26
Adress reviews and do some improvements
Adress changes requested in reviews.
Do some improvements, remove unused imports and format some code to be in the 100 characters line limit.
2021-08-01 12:38:03 +02:00
TiA4f8R e075dd5a63
Update client version, fix some tests, update mocks and do some improvements
Add the origin and the referer headers with the https://www.youtube.com value for YouTube JSON POST requests.
Don't add the consent cookie header for the requests which use the youtubei/innertube API because it's uneeded.
Fix some tests and update YouTube mocks
2021-08-01 12:38:02 +02:00
TiA4f8R b49ae547a3
Do some improvements to YoutubeStreamExtractor
Get the real name of the uploader (for autogenerated channels and music artist channels), like before the migration to the JSON pbj.
Do some other improvements, especially reformatting some code to be in the 100 characters line limit and use final where possible.
2021-08-01 12:38:01 +02:00
TiA4f8R 991b2c7d73
Use lightweight requests when getting and checking YouTube API key and client version 2021-08-01 12:38:01 +02:00