NewPipeExtractor

Commit Graph

Author	SHA1	Message	Date
AudricV	5f0faf34d7	[YouTube] Support playlists as URL navigation endpoints	2024-04-10 21:30:47 +02:00
AudricV	944d3723cd	[YouTube] Do not get twice runs array in YoutubeParsingHelper The runs object was computed twice in getTextFromObject and getUrlFromObject methods, leading to unneeded search costs. This has been avoided by storing the array in method variables.	2024-04-10 21:30:46 +02:00
Stypox	09732d6785	[YouTube] Add support for styles in attributed descriptions Also refactor descriptions parsing.	2024-04-04 21:14:27 +02:00
TobiGr	aaccfecda8	[YouTube] Detect new account termination messages	2024-03-20 14:57:41 +01:00
Stypox	5b59a1a8c5	[YouTube] Move meta info extraction to separate file YoutubeParsingHelper was longer than 2000 lines which caused checkstyle issues	2023-12-21 21:19:08 +01:00
Stypox	b8e12dd76c	[YouTube] Implement emergency meta info YouTube provides that meta info panel when users search for really sensitive content like suicide (e.g. "blue whale"). It contains: - an encouragement as title (e.g. "We are with you") - a phone number as action - details about how to call the phone number (e.g. availability) - an url pointing to the website of an association Also add a test that just checks if a meta info is properly extracted	2023-12-21 21:19:08 +01:00
AudricV	ff8ed7247f	[YouTube] Switch to new consent cookie Also move the documentation of the consent in its setter method in order to be accessible publicly and improve it.	2023-12-08 21:46:46 +01:00
AudricV	2c941794c0	[YouTube] Add utcOffsetMinutes to all InnerTube payloads This should make returned dates consistent between timezones and countries on which the extractor is ran. It was previously only set on YouTube Music search continuations.	2023-12-08 21:46:46 +01:00
AudricV	d97c9e0db1	[YouTube] Improve payloads and URLs of InnerTube requests For every InnerTube request: - Always add a `request` object with the following properties: - "internalExperimentFlags" set to an empty array; - "useSsl" set to "true"; - "lockedSafetyMode" set to "false". - Use proper TODO comment to provide a way to enable restricted mode on every request and add it on requests on which it wasn't present. For YouTube Music: - Remove alt query parameter, as it is not used anymore by the website; - Add prettyPrint query parameter with false value on YouTube Music search continuations.	2023-12-08 21:46:45 +01:00
AudricV	8a9ebcc373	[YouTube] Update InnerTube clients' version and devices' OS version and model	2023-12-08 21:46:45 +01:00
Christian	fc67d49f59	Update copyright notices Update copyright notices to comply to GPLv3 and change NewPipe to NewPipe Extractor on some notices that were not updated.	2023-09-22 19:10:15 -03:00
AudricV	a04bc320de	[YouTube] Convert signature timestamp to integer The signature timestamp is used as a number by HTML5 clients, so it should be used in the same way by the extractor too instead of being a string. As the timestamp doesn't seem to exceed 5 digits, an integer is used to store its value.	2023-09-21 21:59:32 +02:00
AudricV	adfad086ac	[YouTube] Add utility methods to get images from InfoItems and thumbnails arrays Unmodifiable lists of Images are returned, parsed from a given YouTube "thumbnails" JSON array. These methods will be used in all YouTube extractors and InfoItems, as the structures between content types (videos, channels, playlists, ...) are common.	2023-08-12 22:56:27 +02:00
AudricV	7366eab156	[YouTube] Add support for channel tabs and tags and age-restricted channels Support of tags and videos, shorts, live, playlists and channels tabs has been added for non-age restricted channels. Age-restricted channels are now also supported and always returned the videos, shorts and live tabs, accessible using system playlists. These tabs are the only ones which can be accessed using YouTube's desktop website without being logged-in. The videos channel tab parameter has been updated to the one used by the desktop website and when a channel extraction is fetched, this tab is returned in the list of tabs as a cached one in the corresponding link handler. Visitor data support per request has been added, as a valid visitor data is required to fetch continuations with contents on the shorts tab. It is only used in this case to enhance privacy. A dedicated shorts UI elements (reelItemRenderers) extractor has been added, YoutubeReelInfoItemExtractor. These elements do not provide the exact view count, any uploader info (name, URL, avatar, verified status) and the upload date. All service's LinkHandlers are now using the singleton pattern and some code has been also improved on the files changed. Co-authored-by: ThetaDev <t.testboy@gmail.com> Co-authored-by: Stypox <stypox@pm.me>	2023-08-06 12:15:04 +02:00
Kavin	25082d78b0	Replace SecureRandom with Random	2023-08-03 23:00:02 +01:00
ThetaDev	47aa9fed40	fix: set musicClientVersion regex capture group	2023-04-16 19:25:05 +02:00
ThetaDev	8d1303e18f	Add track types to audio streams (#1041)	2023-03-28 00:02:20 +02:00
AudricV	1556adbb2d	[YouTube] Fix hashtags links extraction and escape text in attribute descriptions + HTML links webCommandMetadata object is contained inside a commandMetadata one, so it is not accessible from the root of the navigationEndpoint object. The corresponding statement has been moved at the bottom of the specific endpoints parsing, as the webCommandMetadata object is present almost everywhere, otherwise URLs of some endpoints would have be changed, such as uploader URLs (from channel IDs to handles). As no ParsingException is now thrown by getUrlFromNavigationEndpoint, and so by getTextFromObject, getUrlFromObject and getTextAtKey, the methods which were catching ParsingExceptions thrown by these methods had to be updated. URLs got in the HTML version of getTextFromObject are now escaped properly to provide valid HTML to clients. This has been also done for attribute descriptions, with the description text for this type of descriptions. As YouTube descriptions are in HTML format (except for the fallback on the JSON player response, which is plain text and only happens when there is no visual metadata or a breaking change), all URLs returned are escaped, so tests which are testing presence of URLs with escaped characters had to be updated (it was only the case for YoutubeStreamExtractorDefaultTest.DescriptionTestUnboxing).	2023-02-26 18:43:36 +01:00
TobiGr	3f7df9536e	[YouTube] Fix getting the comment text if the comment contains a hashtag	2023-01-29 20:33:51 +01:00
Stypox	7293991832	[YouTube] Now music mixes can be treated as normal mixes Using a playlist extractor on them would result in "Unviewable playlist" errors	2023-01-15 23:28:59 +01:00
TobiGr	56aab4d971	[YouTube] Fix escaping links in YouTubeParsingHelper.getTextFromObject	2023-01-05 00:28:12 +01:00
Stypox	45636b0d00	Merge pull request #986 from Isira-Seneviratne/Static_maps Use immutable Map factory methods.	2023-01-02 18:11:14 +01:00
Stypox	219c5c5be5	Update extractor/src/main/java/org/schabi/newpipe/extractor/services/youtube/YoutubeParsingHelper.java	2023-01-02 18:11:03 +01:00
Isira Seneviratne	d8ce08d969	Use immutable Map factory methods.	2023-01-02 07:50:31 +05:30
Kavin	01acf79436	Fix for potential XSS attacks.	2022-12-31 20:05:32 +00:00
AudricV	d5437e0bc5	Merge pull request #863 from AudricV/add-content-type-and-content-length-headers-to-post-requests Add Content-Type header to all POST requests without an empty body	2022-12-16 19:32:56 +01:00
Kavin	52fda37915	Implement bold/italic/strike-through support.	2022-11-28 19:06:18 +00:00
AudricV	3891542ca1	Use Downloader's postWithContentType and postWithContentTypeJson methods in services and extractors	2022-11-22 11:37:18 +01:00
AudricV	e9a0d3bd95	[YouTube] Send Content-Type header in all POST requests This header was not sent partially before and was added and guessed by OkHttp. This can create issues when using other HTTP clients than OkHttp, such as Cronet. Some code in the modified classes has been improved and / or deduplicated, and usages of the UTF_8 constant of the Utils class has been replaced by StandardCharsets.UTF_8 where possible. Note that this header has been not added in except in YoutubeDashManifestCreatorsUtils, as an empty body is sent in the POST requests made by this class.	2022-11-22 11:37:16 +01:00
Tobi	2211a24b69	Merge pull request #971 from lrusso96/patch-1 [YouTube] Improve duration parsing	2022-11-16 16:14:54 +01:00
Isira Seneviratne	ddbce3b83d	Add Utils methods for URL encoding/decoding using UTF-8.	2022-11-12 07:29:15 +05:30
Isira Seneviratne	366f5c1632	Use StandardCharsets.UTF_8.	2022-11-12 07:29:15 +05:30
Luigi Russo	c9635218e2	[YouTube] Improve duration parsing	2022-11-09 09:41:29 +01:00
Isira Seneviratne	316d8573fa	Use immutable sets in YoutubeParsingHelper.	2022-11-07 07:50:26 +05:30
ThetaDev	592e1d6386	fix: parsing attributed description with no command runs	2022-11-03 12:10:52 +01:00
ThetaDev	099b53cc4f	[YouTube] Add parser for attributedDescription Also update the mock of the next InnerTube endpoint response of the YoutubeStreamExtractorDefaultTest.DescriptionTestUnboxing test class with an attributedDescription instead of a regular description	2022-11-02 23:11:33 +01:00
Kavin	6a256d0631	Add uploader url and verified to PlaylistInfoItem.	2022-10-30 13:00:19 +00:00
Isira Seneviratne	943b7c033b	Remove EMPTY_STRING.	2022-08-24 06:59:17 +05:30
litetex	8ff7a90f52	Improved consent cookie related constants and documentation	2022-08-21 18:41:40 +02:00
litetex	ecfc370685	Fixed all YTMixPlaylists Added option to choose if you want to consent or not - currently this is done by a static variable in ``YoutubeParsingHelper`` - may not be the best long-term solution but for now the tests work again (in EU countries) 🥳	2022-08-14 14:48:27 +02:00
AudricV	c82317e318	[YouTube] Spoof more mobile clients Additional parameters have been added to the player requests of ANDROID and IOS clients: - for both clients: osName and osVersion: their respective values are: - for the ANDROID one: Android and 12; - for the IOS one: iOS and 15.6.0.19G71. - for the ANDROID client: androidTargetSdkVersion, with the Android SDK version corresponding to the Android version used in the player requests of this client. This parameter is now required with this client to be sure to get a correct player response, otherwise, the one of a video saying that this content is not available in this app and to watch it with the latest version of YouTube can be returned instead; - for the IOS client: deviceMake, with Apple as its value. The iOS version sent in the IOS client player requests has been also updated to the version 15.6 of the OS. Finally, a comment about the requirement to use the signature timestamp from the player JavaScript base file for HTML5 player requests on videos with obfuscated URLs has been added and replaces a previous one which may be not true.	2022-08-12 19:20:31 +02:00
AudricV	d0549a5a52	[YouTube] Update client versions and use a real version for the iOS client The iOS version can be got easily in fact, by looking at the What's New section of the App Store' app page.	2022-08-12 19:20:31 +02:00
AudricV	d7e678aca2	[YouTube] Improve WEB client version and API key HTML extraction Common code in WEB client version HTML extraction has been deduplicated, usage of the Java 8 Stream API has been made and initial data fallback has been used as a last resort. This means that the client version extraction from regexes will be used before this fallback, as it doesn't contain the full client version. This can be used as a way to fingerprint the extractor, even if it seems to be not the case.	2022-08-12 19:20:30 +02:00
Isira Seneviratne	1af6b8eedb	Use Collections.singletonList().	2022-07-27 07:35:57 +05:30
Isira Seneviratne	ff60e05c76	Use Collections.singletonMap().	2022-07-27 07:35:57 +05:30
TiA4f8R	f17f7b9842	Apply requested changes in YoutubeParsingHelper	2022-05-28 12:00:55 +02:00
Stypox	50272db946	Apply reviews: improve comments, remove FILE, remove Stream#equals(Stream)	2022-05-28 12:00:49 +02:00
TiA4f8R	aa4c10e751	Improve documentation and address most of the requested changes Also fix some issues in several places, in the code and the documentation.	2022-05-28 12:00:46 +02:00
TiA4f8R	a857684442	Apply changes in YoutubeStreamExtractor Extract post live DVR streams as post live streams instead of live streams. A new class has been in order to improve code: ItagInfo, which stores an itag, the content (URL) extracted and if its an URL or not. A functional interface has been added in order to abstract the stream building: StreamBuilderHelper. Also add the cver parameter added by the desktop web client on the corresponding streams (a new method has been added in YoutubeParsingHelper to check this and another for Android streams). Some code in these classes has been also refactored/improved/optimized.	2022-05-28 12:00:44 +02:00
TiA4f8R	c34b5e3a8b	[YouTube] Fix extraction of YouTube Music client version and API key when using YouTube Music's website in EU Google returns now the consent page of YouTube for YouTube Music in EU, which can be also avoided by adding the ucbcb parameter to the URL with the value 1 ("?ucbcb=1").	2022-05-15 11:20:06 +02:00

158 Commits