Commit Graph

158 Commits

Author SHA1 Message Date
Stypox 9d0dd36034
[YouTube] Create constants for client names/versions 2024-04-20 11:43:54 +02:00
AudricV 27dc1b1f50
[YouTube] Remove usage of API keys for InnerTube requests, bump versions
In almost all cases, official clients no longer use API keys for the requests
we make (the Android app still uses one until it gets a configuration).

Clients and device OS versions have been bumped to their latest stable version
known.

Methods and fields related to API keys have been renamed or deleted if they're
no longer relevant.
2024-04-10 21:19:02 +02:00
Stypox 09732d6785
[YouTube] Add support for styles in attributed descriptions
Also refactor descriptions parsing.
2024-04-04 21:14:27 +02:00
TobiGr aaccfecda8 [YouTube] Detect new account termination messages 2024-03-20 14:57:41 +01:00
Stypox 5b59a1a8c5
[YouTube] Move meta info extraction to separate file
YoutubeParsingHelper was longer than 2000 lines which caused checkstyle issues
2023-12-21 21:19:08 +01:00
Stypox b8e12dd76c
[YouTube] Implement emergency meta info
YouTube provides that meta info panel when users search for really sensitive content like suicide (e.g. "blue whale").

It contains:
- an encouragement as title (e.g. "We are with you")
- a phone number as action
- details about how to call the phone number (e.g. availability)
- a URL pointing to the website of an association

Also add a test that just checks if a meta info is properly extracted
2023-12-21 21:19:08 +01:00
AudricV ff8ed7247f
[YouTube] Switch to new consent cookie
Also move the documentation of the consent to its setter method, so that it is
publicly accessible, and improve it.
2023-12-08 21:46:46 +01:00
AudricV 2c941794c0
[YouTube] Add utcOffsetMinutes to all InnerTube payloads
This should make returned dates consistent between timezones and countries in
which the extractor is run.

It was previously only set on YouTube Music search continuations.
2023-12-08 21:46:46 +01:00
AudricV d97c9e0db1
[YouTube] Improve payloads and URLs of InnerTube requests
For every InnerTube request:
- Always add a `request` object with the following properties:
  - "internalExperimentFlags" set to an empty array;
  - "useSsl" set to "true";
  - "lockedSafetyMode" set to "false".
- Use proper TODO comment to provide a way to enable restricted mode on every
request and add it on requests on which it wasn't present.

For YouTube Music:
- Remove alt query parameter, as it is not used anymore by the website;
- Add prettyPrint query parameter with false value on YouTube Music search
continuations.
2023-12-08 21:46:45 +01:00
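The `request` object described in the commit above can be sketched as follows. This is a minimal illustration using plain Java collections, not the extractor's actual code: the class and method names are hypothetical, and emitting `useSsl` and `lockedSafetyMode` as booleans rather than strings is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not part of NewPipe Extractor.
final class InnerTubeRequestSketch {
    private InnerTubeRequestSketch() {
    }

    // Builds the "request" object with the three properties listed in the commit message.
    static Map<String, Object> buildRequestObject() {
        final Map<String, Object> request = new LinkedHashMap<>();
        // Empty array of experiment flags
        request.put("internalExperimentFlags", new ArrayList<String>());
        // Assumption: both flags are sent as JSON booleans
        request.put("useSsl", true);
        request.put("lockedSafetyMode", false);
        return request;
    }
}
```

The returned map would then be serialized under the `request` key of the InnerTube payload by whatever JSON library is in use.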
AudricV 8a9ebcc373
[YouTube] Update InnerTube clients' version and devices' OS version and model 2023-12-08 21:46:45 +01:00
Christian fc67d49f59 Update copyright notices
Update copyright notices to comply with GPLv3 and change NewPipe to NewPipe Extractor on some notices that were not updated.
2023-09-22 19:10:15 -03:00
AudricV a04bc320de
[YouTube] Convert signature timestamp to integer
The signature timestamp is used as a number by HTML5 clients, so it should be
used in the same way by the extractor too instead of being a string.

As the timestamp doesn't seem to exceed 5 digits, an integer is used to store
its value.
2023-09-21 21:59:32 +02:00
AudricV adfad086ac
[YouTube] Add utility methods to get images from InfoItems and thumbnails arrays
Unmodifiable lists of Images are returned, parsed from a given YouTube
"thumbnails" JSON array.

These methods will be used in all YouTube extractors and InfoItems, as the
structures between content types (videos, channels, playlists, ...) are common.
2023-08-12 22:56:27 +02:00
AudricV 7366eab156
[YouTube] Add support for channel tabs and tags and age-restricted channels
Support for tags and for the videos, shorts, live, playlists and channels tabs
has been added for non-age-restricted channels.

Age-restricted channels are now also supported and always return the videos,
shorts and live tabs, accessible using system playlists. These tabs are the
only ones which can be accessed using YouTube's desktop website without being
logged in.

The videos channel tab parameter has been updated to the one used by the
desktop website and when a channel extraction is fetched, this tab is returned
in the list of tabs as a cached one in the corresponding link handler.

Visitor data support per request has been added, as a valid visitor data is
required to fetch continuations with contents on the shorts tab. It is only
used in this case to enhance privacy.

A dedicated extractor for shorts UI elements (reelItemRenderers) has been added,
YoutubeReelInfoItemExtractor. These elements do not provide the exact view
count, any uploader info (name, URL, avatar, verified status) or the upload
date.

All of the service's LinkHandlers are now using the singleton pattern and some
code has also been improved in the changed files.

Co-authored-by: ThetaDev <t.testboy@gmail.com>
Co-authored-by: Stypox <stypox@pm.me>
2023-08-06 12:15:04 +02:00
Kavin 25082d78b0
Replace SecureRandom with Random 2023-08-03 23:00:02 +01:00
ThetaDev 47aa9fed40 fix: set musicClientVersion regex capture group 2023-04-16 19:25:05 +02:00
ThetaDev 8d1303e18f
Add track types to audio streams (#1041) 2023-03-28 00:02:20 +02:00
AudricV 1556adbb2d
[YouTube] Fix hashtags links extraction and escape text in attributed descriptions + HTML links
webCommandMetadata object is contained inside a commandMetadata one, so it is
not accessible from the root of the navigationEndpoint object.

The corresponding statement has been moved to the bottom of the specific
endpoints parsing, as the webCommandMetadata object is present almost
everywhere; otherwise, URLs of some endpoints would have been changed, such as
uploader URLs (from channel IDs to handles).

As no ParsingException is now thrown by getUrlFromNavigationEndpoint, and so by
getTextFromObject, getUrlFromObject and getTextAtKey, the methods which were
catching ParsingExceptions thrown by these methods had to be updated.

URLs obtained in the HTML version of getTextFromObject are now escaped properly
to provide valid HTML to clients. This has also been done for attributed
descriptions, with the description text for this type of descriptions.

As YouTube descriptions are in HTML format (except for the fallback on the JSON
player response, which is plain text and only happens when there is no visual
metadata or a breaking change), all URLs returned are escaped, so tests which
are testing presence of URLs with escaped characters had to be updated (it was
only the case for YoutubeStreamExtractorDefaultTest.DescriptionTestUnboxing).
2023-02-26 18:43:36 +01:00
TobiGr 3f7df9536e [YouTube] Fix getting the comment text if the comment contains a hashtag 2023-01-29 20:33:51 +01:00
Stypox 7293991832
[YouTube] Now music mixes can be treated as normal mixes
Using a playlist extractor on them would result in "Unviewable playlist" errors
2023-01-15 23:28:59 +01:00
TobiGr 56aab4d971 [YouTube] Fix escaping links in YouTubeParsingHelper.getTextFromObject 2023-01-05 00:28:12 +01:00
Stypox 45636b0d00
Merge pull request #986 from Isira-Seneviratne/Static_maps
Use immutable Map factory methods.
2023-01-02 18:11:14 +01:00
Stypox 219c5c5be5
Update extractor/src/main/java/org/schabi/newpipe/extractor/services/youtube/YoutubeParsingHelper.java 2023-01-02 18:11:03 +01:00
Isira Seneviratne d8ce08d969 Use immutable Map factory methods. 2023-01-02 07:50:31 +05:30
Kavin 01acf79436
Fix for potential XSS attacks. 2022-12-31 20:05:32 +00:00
AudricV d5437e0bc5
Merge pull request #863 from AudricV/add-content-type-and-content-length-headers-to-post-requests
Add Content-Type header to all POST requests without an empty body
2022-12-16 19:32:56 +01:00
Kavin 52fda37915
Implement bold/italic/strike-through support. 2022-11-28 19:06:18 +00:00
AudricV 3891542ca1
Use Downloader's postWithContentType and postWithContentTypeJson methods in services and extractors 2022-11-22 11:37:18 +01:00
AudricV e9a0d3bd95
[YouTube] Send Content-Type header in all POST requests
This header was previously missing from some requests, in which case it was added and guessed by OkHttp. This can create issues when using HTTP clients other than OkHttp, such as Cronet.

Some code in the modified classes has been improved and / or deduplicated, and usages of the UTF_8 constant of the Utils class have been replaced by StandardCharsets.UTF_8 where possible.

Note that this header has been added everywhere except in YoutubeDashManifestCreatorsUtils, as an empty body is sent in the POST requests made by this class.
2022-11-22 11:37:16 +01:00
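The rule described in the commit above (set Content-Type explicitly on POST requests with a body, skip it for empty bodies) can be illustrated with the JDK's own `java.net.http` API. This is only a sketch: the extractor uses its own Downloader abstraction, not this API, and the class and method names below are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch using the JDK HTTP client instead of the extractor's Downloader.
final class PostContentTypeSketch {
    private PostContentTypeSketch() {
    }

    static HttpRequest buildPost(final String url, final String jsonBody) {
        final HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(url));
        if (jsonBody == null || jsonBody.isEmpty()) {
            // Empty body: no Content-Type header is set
            return builder.POST(HttpRequest.BodyPublishers.noBody()).build();
        }
        // Non-empty body: declare the type explicitly instead of relying on the
        // HTTP client to guess it
        return builder
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

Setting the header at the call site keeps behavior identical across HTTP clients, which is the concern the commit raises about clients other than OkHttp.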
Tobi 2211a24b69
Merge pull request #971 from lrusso96/patch-1
[YouTube] Improve duration parsing
2022-11-16 16:14:54 +01:00
Isira Seneviratne ddbce3b83d Add Utils methods for URL encoding/decoding using UTF-8. 2022-11-12 07:29:15 +05:30
Isira Seneviratne 366f5c1632 Use StandardCharsets.UTF_8. 2022-11-12 07:29:15 +05:30
Luigi Russo c9635218e2
[YouTube] Improve duration parsing 2022-11-09 09:41:29 +01:00
Isira Seneviratne 316d8573fa Use immutable sets in YoutubeParsingHelper. 2022-11-07 07:50:26 +05:30
ThetaDev 592e1d6386 fix: parsing attributed description with no command runs 2022-11-03 12:10:52 +01:00
ThetaDev 099b53cc4f
[YouTube] Add parser for attributedDescription
Also update the mock of the next InnerTube endpoint response of the
YoutubeStreamExtractorDefaultTest.DescriptionTestUnboxing test class with an
attributedDescription instead of a regular description
2022-11-02 23:11:33 +01:00
Kavin 6a256d0631
Add uploader url and verified to PlaylistInfoItem. 2022-10-30 13:00:19 +00:00
Isira Seneviratne 943b7c033b Remove EMPTY_STRING. 2022-08-24 06:59:17 +05:30
litetex 8ff7a90f52 Improved consent cookie related constants and documentation 2022-08-21 18:41:40 +02:00
litetex ecfc370685 Fixed all YTMixPlaylists
Added option to choose if you want to consent or not - currently this is done by a static variable in ``YoutubeParsingHelper`` - may not be the best long-term solution but for now the tests work again (in EU countries) 🥳
2022-08-14 14:48:27 +02:00
AudricV c82317e318
[YouTube] Spoof more mobile clients
Additional parameters have been added to the player requests of ANDROID and IOS
clients:

- for both clients: osName and osVersion: their respective values are:
  - for the ANDROID one: Android and 12;
  - for the IOS one: iOS and 15.6.0.19G71.
- for the ANDROID client: androidTargetSdkVersion, with the Android SDK version
  corresponding to the Android version used in the player requests of this
  client. This parameter is now required with this client to be sure to get a
  correct player response; otherwise, the response of a video saying that this
  content is not available in this app and should be watched with the latest
  version of YouTube can be returned instead;
- for the IOS client: deviceMake, with Apple as its value.

The iOS version sent in the IOS client player requests has been also updated to
the version 15.6 of the OS.

Finally, a comment about the requirement to use the signature timestamp from
the player JavaScript base file for HTML5 player requests on videos with
obfuscated URLs has been added and replaces a previous one which may not be
true.
2022-08-12 19:20:31 +02:00
AudricV d0549a5a52
[YouTube] Update client versions and use a real version for the iOS client
The iOS version can in fact be found easily, by looking at the What's New section of the App Store app page.
2022-08-12 19:20:31 +02:00
AudricV d7e678aca2
[YouTube] Improve WEB client version and API key HTML extraction
Common code in the WEB client version HTML extraction has been deduplicated, the Java 8 Stream API is now used, and the initial data fallback is used as a last resort.
This means that the client version extraction from regexes will be used before this fallback, as the fallback doesn't contain the full client version.
This could be used as a way to fingerprint the extractor, even if that does not seem to be the case.
2022-08-12 19:20:30 +02:00
Isira Seneviratne 1af6b8eedb Use Collections.singletonList(). 2022-07-27 07:35:57 +05:30
Isira Seneviratne ff60e05c76 Use Collections.singletonMap(). 2022-07-27 07:35:57 +05:30
TiA4f8R f17f7b9842
Apply requested changes in YoutubeParsingHelper 2022-05-28 12:00:55 +02:00
Stypox 50272db946
Apply reviews: improve comments, remove FILE, remove Stream#equals(Stream) 2022-05-28 12:00:49 +02:00
TiA4f8R aa4c10e751
Improve documentation and address most of the requested changes
Also fix some issues in several places, in the code and the documentation.
2022-05-28 12:00:46 +02:00
TiA4f8R a857684442
Apply changes in YoutubeStreamExtractor
Extract post live DVR streams as post live streams instead of live streams.

A new class has been added in order to improve the code: ItagInfo, which stores an itag, the content (URL) extracted and whether it is a URL or not.
A functional interface has been added in order to abstract the stream building: StreamBuilderHelper.
Also add the cver parameter added by the desktop web client on the corresponding streams (a new method has been added in YoutubeParsingHelper to check this and another for Android streams).

Some code in these classes has been also refactored/improved/optimized.
2022-05-28 12:00:44 +02:00
TiA4f8R c34b5e3a8b
[YouTube] Fix extraction of YouTube Music client version and API key when using YouTube Music's website in EU
Google now returns the consent page of YouTube for YouTube Music in the EU, which can also be avoided by adding the ucbcb parameter to the URL with the value 1 ("?ucbcb=1").
2022-05-15 11:20:06 +02:00
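Appending the ucbcb parameter mentioned in the commit above could look like the following sketch. The helper name is hypothetical, and the choice to handle URLs that already carry a query string is an assumption; the commit message only mentions the "?ucbcb=1" form.

```java
// Hypothetical sketch, not part of NewPipe Extractor.
final class ConsentBypassSketch {
    private ConsentBypassSketch() {
    }

    // Appends ucbcb=1, using "?" or "&" depending on whether a query string exists.
    static String withConsentBypass(final String url) {
        final String separator = url.contains("?") ? "&" : "?";
        return url + separator + "ucbcb=1";
    }
}
```

For example, `withConsentBypass("https://music.youtube.com/")` yields `https://music.youtube.com/?ucbcb=1`.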