mirror of
https://github.com/l1ving/youtube-dl
synced 2025-02-03 16:42:52 +08:00
Merge branch 'master' of github.com:rg3/youtube-dl
commit
979dd84f48
8
.github/ISSUE_TEMPLATE.md
vendored
@@ -6,8 +6,8 @@

---

### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.12.15*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with an outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.12.15**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.03.07*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with an outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.03.07**

### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2016.12.15
[debug] youtube-dl version 2017.03.07
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
@@ -50,6 +50,8 @@ $ youtube-dl -v <your command line>
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc

Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/rg3/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for a site support request to be accepted, all provided example URLs should not violate any copyrights.

---

### Description of your *issue*, suggested solution and other information
2
.github/ISSUE_TEMPLATE_tmpl.md
vendored
@@ -50,6 +50,8 @@ $ youtube-dl -v <your command line>
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc

Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/rg3/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for a site support request to be accepted, all provided example URLs should not violate any copyrights.

---

### Description of your *issue*, suggested solution and other information
@@ -6,8 +6,12 @@ python:
  - "3.3"
  - "3.4"
  - "3.5"
  - "3.6"
sudo: false
script: nosetests test --verbose
env:
  - YTDL_TEST_SET=core
  - YTDL_TEST_SET=download
script: ./devscripts/run_tests.sh
notifications:
  email:
    - filippo.valsorda@gmail.com
19
AUTHORS
@@ -190,3 +190,22 @@ John Hawkinson
Rich Leeper
Zhong Jianxin
Thor77
Mattias Wadman
Arjan Verwer
Costy Petrisor
Logan B
Alex Seiler
Vijay Singh
Paul Hartmann
Stephen Chen
Fabian Stahl
Bagira
Odd Stråbø
Philip Herzog
Thomas Christlieb
Marek Rusinowski
Tobias Gruetzmacher
Olivier Bilodeau
Lars Vierbergen
Juanjo Benages
Xiao Di Guan
@@ -58,7 +58,7 @@ We are then presented with a very complicated request when the original problem

Some of our users seem to think there is a limit of issues they can or should open. There is no limit of issues they can or should open. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.

In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, Whitehouse podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, White house podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.

### Is anyone going to need the feature?

@@ -94,7 +94,7 @@ If you want to create a build of youtube-dl yourself, you'll need

If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites, thus pull requests adding support for them **will be rejected**.

After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):

1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with:
@@ -124,7 +124,7 @@ After you have ensured this site is distributing it's content legally, you can f
'id': '42',
'ext': 'mp4',
'title': 'Video title goes here',
'thumbnail': 're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg$',
# TODO more properties, either as:
# * A value
# * MD5 checksum; start the string with md5:
@@ -199,7 +199,7 @@ Assume at this point `meta`'s layout is:
}
```

Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional metafield, you should be prepared for this key to be missing from the `meta` dict, so you should extract it like:
Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional meta field, you should be prepared for this key to be missing from the `meta` dict, so you should extract it like:

```python
description = meta.get('summary') # correct
605
ChangeLog
@@ -1,3 +1,608 @@
version <unreleased>

Extractors
* [miomio] Fix extraction (#12291, #12388, #12402)


version 2017.03.07

Core
* Metadata are now added after conversion (#5594)

Extractors
* [soundcloud] Update client id (#12376)
* [openload] Fix extraction (#10408, #12357)


version 2017.03.06

Core
+ [utils] Process bytestrings in urljoin (#12369)
* [extractor/common] Improve height extraction and extract bitrate
* [extractor/common] Move jwplayer formats extraction in separate method
+ [external:ffmpeg] Limit test download size to 10KiB (#12362)
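
The `urljoin` entry above concerns inputs that arrive as byte strings rather than text. A minimal sketch of that idea follows, assuming the bytes are UTF-8 encoded. `join_url` is a hypothetical helper written for this illustration and is not youtube-dl's own `urljoin`.

```python
from urllib.parse import urljoin


def join_url(base, path):
    """Join base and path, accepting str or bytes for either argument.

    Hypothetical helper for illustration; decoding as UTF-8 is an assumption.
    """
    if isinstance(base, bytes):
        base = base.decode('utf-8')
    if isinstance(path, bytes):
        path = path.decode('utf-8')
    return urljoin(base, path)


print(join_url(b'https://example.com/a/', b'b/c.mp4'))
```

Decoding before joining keeps the return type consistent (always `str`), so callers do not have to handle a mix of text and bytes.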

Extractors
+ [drtv] Add geo countries to GeoRestrictedError
+ [drtv:live] Bypass geo restriction
+ [tunepk] Add extractor (#12197, #12243)


version 2017.03.05

Extractors
+ [twitch] Add basic support for two-factor authentication (#11974)
+ [vier] Add support for vijf.be (#12304)
+ [redbulltv] Add support for redbull.tv (#3919, #11948)
* [douyutv] Switch to the PC API to escape the 5-min limitation (#12316)
+ [generic] Add support for rutube embeds
+ [rutube] Relax URL regular expression
+ [vrak] Add support for vrak.tv (#11452)
+ [brightcove:new] Add ability to smuggle geo_countries into URL
+ [brightcove:new] Raise GeoRestrictedError
* [go] Relax URL regular expression (#12341)
* [24video] Use original host for requests (#12339)
* [ruutu] Disable DASH formats (#12322)


version 2017.03.02

Core
+ [adobepass] Add support for Charter Spectrum (#11465)
* [YoutubeDL] Don't sanitize identifiers in output template (#12317)

Extractors
* [facebook] Fix extraction (#12323, #12330)
* [youtube] Mark errors about rental videos as expected (#12324)
+ [npo] Add support for audio
* [npo] Adapt to app.php API (#12311, #12320)


version 2017.02.28

Core
+ [utils] Add bytes_to_long and long_to_bytes
+ [utils] Add pkcs1pad
+ [aes] Add aes_cbc_encrypt
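
The helper names in the Core entries above suggest byte/integer conversion and PKCS#1 padding. The sketch below shows those two techniques in plain Python, assuming `pkcs1pad` refers to PKCS#1 v1.5 encryption padding. The function names `bytes_to_int`, `int_to_bytes` and `pkcs1_v15_pad` are hypothetical and are not the youtube-dl implementations.

```python
import secrets


def bytes_to_int(data):
    """Interpret a byte string as a big-endian unsigned integer."""
    return int.from_bytes(data, 'big')


def int_to_bytes(value):
    """Encode a non-negative integer as the shortest big-endian byte string."""
    length = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(length, 'big')


def pkcs1_v15_pad(data, length):
    """Pad data to `length` bytes as 00 02 <nonzero random bytes> 00 <data>."""
    if len(data) > length - 11:
        raise ValueError('data too long for the requested block length')
    # The filler must not contain zero bytes, so draw each byte from 1..255.
    filler = bytes(secrets.randbelow(255) + 1 for _ in range(length - len(data) - 3))
    return b'\x00\x02' + filler + b'\x00' + data


padded = pkcs1_v15_pad(b'secret', 32)
print(len(padded), bytes_to_int(int_to_bytes(65537)))
```

The 11-byte overhead check comes from the padding layout: two header bytes, at least eight filler bytes, and one zero separator.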

Extractors
+ [azmedien:showplaylist] Add support for show playlists (#12160)
+ [youtube:playlist] Recognize another playlist pattern (#11928, #12286)
+ [daisuki] Add support for daisuki.net (#2486, #3186, #4738, #6175, #7776,
  #10060)
* [douyu] Fix extraction (#12301)


version 2017.02.27

Core
* [downloader/common] Limit displaying 2 digits after decimal point in sleep
  interval message (#12183)
+ [extractor/common] Add preference to _parse_html5_media_entries

Extractors
+ [npo] Add support for zapp.nl
+ [npo] Add support for hetklokhuis.nl (#12293)
- [scivee] Remove extractor (#9315)
+ [cda] Decode download URL (#12255)
+ [crunchyroll] Improve uploader extraction (#12267)
+ [youtube] Raise GeoRestrictedError
+ [dailymotion] Raise GeoRestrictedError
+ [mdr] Recognize more URL patterns (#12169)
+ [tvigle] Raise GeoRestrictedError
* [vevo] Fix extraction for videos with the new streams/streamsV3 format
  (#11719)
+ [freshlive] Add support for freshlive.tv (#12175)
+ [xhamster] Capture and output videoClosed error (#12263)
+ [etonline] Add support for etonline.com (#12236)
+ [njpwworld] Add support for njpwworld.com (#11561)
* [amcnetworks] Relax URL regular expression (#12127)


version 2017.02.24.1

Extractors
* [noco] Modernize
* [noco] Switch login URL to https (#12246)
+ [thescene] Extract more metadata
* [thescene] Fix extraction (#12235)
+ [tubitv] Use geo bypass mechanism
* [openload] Fix extraction (#10408)
+ [ivi] Raise GeoRestrictedError


version 2017.02.24

Core
* [options] Hide deprecated options from --help
* [options] Deprecate --autonumber-size
+ [YoutubeDL] Add support for string formatting operations in output template
  (#5185, #5748, #6841, #9929, #9966, #9978, #12189)
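
As a rough illustration of string formatting operations in an output template, the sketch below assumes the template is expanded with Python's `%`-style formatting against a dict of metadata fields. The `info` dict and its values are made up for the example.

```python
# Hypothetical metadata for one video; the field names follow the
# %(name)s style used in output templates.
info = {'playlist_index': 7, 'title': 'Video title', 'ext': 'mp4', 'view_count': 1234}

# A zero-padded integer field and plain string fields in one template.
template = '%(playlist_index)03d - %(title)s.%(ext)s'
print(template % info)  # 007 - Video title.mp4

# Width and padding flags work the same way for any numeric field.
print('%(view_count)08d' % info)  # 00001234
```

Padding numeric fields this way keeps file names sorted correctly when a playlist has more than nine entries.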

Extractors
+ [lynda:course] Add webpage extraction fallback (#12238)
* [go] Sign all uplynk URLs and use geo bypass only for free videos
  (#12087, #12210)
+ [skylinewebcams] Add support for skylinewebcams.com (#12221)
+ [instagram] Add support for multi video posts (#12226)
+ [crunchyroll] Extract playlist entries ids
* [mgtv] Fix extraction
+ [sohu] Raise GeoRestrictedError
+ [leeco] Raise GeoRestrictedError and use geo bypass mechanism


version 2017.02.22

Extractors
* [crunchyroll] Fix descriptions with double quotes (#12124)
* [dailymotion] Make comment count optional (#12209)
+ [vidzi] Add support for vidzi.cc (#12213)
+ [24video] Add support for 24video.tube (#12217)
+ [crackle] Use geo bypass mechanism
+ [viewster] Use geo verification headers
+ [tfo] Improve geo restriction detection and use geo bypass mechanism
+ [telequebec] Use geo bypass mechanism
+ [limelight] Extract PlaylistService errors and improve geo restriction
  detection


version 2017.02.21

Core
* [extractor/common] Allow calling _initialize_geo_bypass from extractors
  (#11970)
+ [adobepass] Add support for Time Warner Cable (#12191)
+ [travis] Run tests in parallel
+ [downloader/ism] Honor HTTP headers when downloading fragments
+ [downloader/dash] Honor HTTP headers when downloading fragments
+ [utils] Add GeoUtils class for working with geo tools and GeoUtils.random_ipv4
+ Add option --geo-bypass-country for explicit geo bypass on behalf of
  specified country
+ Add options to control geo bypass mechanism --geo-bypass and --no-geo-bypass
+ Add experimental geo restriction bypass mechanism based on faking
  X-Forwarded-For HTTP header
+ [utils] Introduce GeoRestrictedError for geo restricted videos
+ [utils] Introduce YoutubeDLError base class for all youtube-dl exceptions
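
The geo bypass entries above describe faking the `X-Forwarded-For` HTTP header with an address from a chosen country. The sketch below shows the general technique only: pick a random IPv4 address inside a CIDR block and attach it to the request headers. The function names and the CIDR block are assumptions for this example and do not reproduce `GeoUtils.random_ipv4`.

```python
import ipaddress
import random


def random_ipv4_in(cidr):
    """Return a random address inside the given IPv4 CIDR block."""
    network = ipaddress.ip_network(cidr)
    offset = random.randrange(network.num_addresses)
    return str(network[offset])


def with_fake_forwarded_for(headers, cidr):
    """Return a copy of headers with X-Forwarded-For set to a random address."""
    spoofed = dict(headers)
    spoofed['X-Forwarded-For'] = random_ipv4_in(cidr)
    return spoofed


# 192.0.2.0/24 is a documentation-only range, used here as a placeholder.
headers = with_fake_forwarded_for({'User-Agent': 'example'}, '192.0.2.0/24')
print(headers['X-Forwarded-For'])
```

Whether a server honours the header depends entirely on the site, which fits the "experimental" label in the entry above.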

Extractors
+ [ninecninemedia] Use geo bypass mechanism
* [spankbang] Make uploader optional (#12193)
+ [iprima] Improve geo restriction detection and disable geo bypass
* [iprima] Modernize
* [commonmistakes] Disable UnicodeBOM extractor test for python 3.2
+ [prosiebensat1] Throw ExtractionError on unsupported page type (#12180)
* [nrk] Update _API_HOST and relax _VALID_URL
+ [tv4] Bypass geo restriction and improve detection
* [tv4] Switch to hls3 protocol (#12177)
+ [viki] Improve geo restriction detection
+ [vgtv] Improve geo restriction detection
+ [srgssr] Improve geo restriction detection
+ [vbox7] Improve geo restriction detection and use geo bypass mechanism
+ [svt] Improve geo restriction detection and use geo bypass mechanism
+ [pbs] Improve geo restriction detection and use geo bypass mechanism
+ [ondemandkorea] Improve geo restriction detection and use geo bypass mechanism
+ [nrk] Improve geo restriction detection and use geo bypass mechanism
+ [itv] Improve geo restriction detection and use geo bypass mechanism
+ [go] Improve geo restriction detection and use geo bypass mechanism
+ [dramafever] Improve geo restriction detection and use geo bypass mechanism
* [brightcove:legacy] Restrict videoPlayer value (#12040)
+ [tvn24] Add support for tvn24.pl and tvn24bis.pl (#11679)
+ [thisav] Add support for HTML5 media (#11771)
* [metacafe] Bypass family filter (#10371)
* [viceland] Improve info extraction


version 2017.02.17

Extractors
* [heise] Improve extraction (#9725)
* [ellentv] Improve (#11653)
* [openload] Fix extraction (#10408, #12002)
+ [theplatform] Recognize URLs with whitespaces (#12044)
* [einthusan] Relax URL regular expression (#12141, #12159)
+ [generic] Support complex JWPlayer embedded videos (#12030)
* [elpais] Improve extraction (#12139)


version 2017.02.16

Core
+ [utils] Add support for quoted string literals in --match-filter (#8050,
  #12142, #12144)

Extractors
* [ceskatelevize] Lower priority for audio description sources (#12119)
* [amcnetworks] Fix extraction (#12127)
* [pinkbike] Fix uploader extraction (#12054)
+ [onetpl] Add support for businessinsider.com.pl and plejada.pl
+ [onetpl] Add support for onet.pl (#10507)
+ [onetmvp] Add shortcut extractor
+ [vodpl] Add support for vod.pl (#12122)
+ [pornhub] Extract video URL from tv platform site (#12007, #12129)
+ [ceskatelevize] Extract DASH formats (#12119, #12133)


version 2017.02.14

Core
* TypeError is fixed with Python 2.7.13 on Windows (#11540, #12085)

Extractors
* [zdf] Fix extraction (#12117)
* [xtube] Fix extraction for both kinds of video id (#12088)
* [xtube] Improve title extraction (#12088)
+ [lemonde] Fallback delegate extraction to generic extractor (#12115, #12116)
* [bellmedia] Allow video id longer than 6 characters (#12114)
+ [limelight] Add support for referer protected videos
* [disney] Improve extraction (#4975, #11000, #11882, #11936)
* [hotstar] Improve extraction (#12096)
* [einthusan] Fix extraction (#11416)
+ [aenetworks] Add support for lifetimemovieclub.com (#12097)
* [youtube] Fix parsing codecs (#12091)


version 2017.02.11

Core
+ [utils] Introduce get_elements_by_class and get_elements_by_attribute
  utility functions
+ [extractor/common] Skip m3u8 manifests protected with Adobe Flash Access

Extractors
* [pluralsight:course] Fix extraction (#12075)
+ [bbc] Extract m3u8 formats with 320k audio
* [facebook] Relax video id matching (#11017, #12055, #12056)
+ [corus] Add support for Corus Entertainment sites (#12060, #9164)
+ [pluralsight] Detect blocked account error message (#12070)
+ [bloomberg] Add another video id pattern (#12062)
* [extractor/commonmistakes] Restrict URL regular expression (#12050)
+ [tvplayer] Add support for tvplayer.com


version 2017.02.10

Extractors
* [xtube] Fix extraction (#12023)
* [pornhub] Fix extraction (#12007, #12018)
* [facebook] Improve JS data regular expression (#12042)
* [kaltura] Improve embed partner id extraction (#12041)
+ [sprout] Add support for sproutonline.com
* [6play] Improve extraction
+ [scrippsnetworks:watch] Add support for Scripps Networks sites (#10765)
+ [go] Add support for Adobe Pass authentication (#11468, #10831)
* [6play] Fix extraction (#12011)
+ [nbc] Add support for Adobe Pass authentication (#12006)


version 2017.02.07

Core
* [extractor/common] Fix audio only with audio group in m3u8 (#11995)
+ [downloader/fragment] Respect --no-part
* [extractor/common] Speed-up HTML5 media entries extraction (#11979)

Extractors
* [pornhub] Fix extraction (#11997)
+ [canalplus] Add support for cstar.fr (#11990)
+ [extractor/generic] Improve RTMP support (#11993)
+ [gaskrank] Add support for gaskrank.tv (#11685)
* [bandcamp] Fix extraction for incomplete albums (#11727)
* [iwara] Fix extraction (#11781)
* [googledrive] Fix extraction on Python 3.6
+ [videopress] Add support for videopress.com
+ [afreecatv] Extract RTMP formats


version 2017.02.04.1

Extractors
+ [twitch:stream] Add support for player.twitch.tv (#11971)
* [radiocanada] Fix extraction for toutv rtmp formats


version 2017.02.04

Core
+ Add --playlist-random to shuffle playlists (#11889, #11901)
* [utils] Improve comments processing in js_to_json (#11947)
* [utils] Handle single-line comments in js_to_json
* [downloader/external:ffmpeg] Minimize the use of aac_adtstoasc filter

Extractors
+ [piksel] Add another app token pattern (#11969)
+ [vk] Capture and output author blocked error message (#11965)
+ [turner] Fix secure HLS formats downloading with ffmpeg (#11358, #11373,
  #11800)
+ [drtv] Add support for live and radio sections (#1827, #3427)
* [myspace] Fix extraction and extract HLS and HTTP formats
+ [youtube] Add format info for itag 325 and 328
* [vine] Fix extraction (#11955)
- [sportbox] Remove extractor (#11954)
+ [filmon] Add support for filmon.com (#11187)
+ [infoq] Add audio only formats (#11565)
* [douyutv] Improve room id regular expression (#11931)
* [iprima] Fix extraction (#11920, #11896)
* [youtube] Fix ytsearch when cookies are provided (#11924)
* [go] Relax video id regular expression (#11937)
* [facebook] Fix title extraction (#11941)
+ [youtube:playlist] Recognize TL playlists (#11945)
+ [bilibili] Support new Bangumi URLs (#11845)
+ [cbc:watch] Extract audio codec for audio only formats (#11893)
+ [elpais] Fix extraction for some URLs (#11765)


version 2017.02.01

Extractors
+ [facebook] Add another fallback extraction scenario (#11926)
* [prosiebensat1] Fix extraction of descriptions (#11810, #11929)
- [crunchyroll] Remove ScaledBorderAndShadow settings (#9028)
+ [vimeo] Extract upload timestamp
+ [vimeo] Extract license (#8726, #11880)
+ [nrk:series] Add support for series (#11571, #11711)


version 2017.01.31

Core
+ [compat] Add compat_etree_register_namespace

Extractors
* [youtube] Fix extraction for domainless player URLs (#11890, #11891, #11892,
  #11894, #11895, #11897, #11900, #11903, #11904, #11906, #11907, #11909,
  #11913, #11914, #11915, #11916, #11917, #11918, #11919)
+ [vimeo] Extract both mixed and separated DASH formats
+ [ruutu] Extract DASH formats
* [itv] Fix extraction for python 2.6


version 2017.01.29

Core
* [extractor/common] Fix initialization template (#11605, #11825)
+ [extractor/common] Document fragment_base_url and fragment's path fields
* [extractor/common] Fix duration per DASH segment (#11868)
+ Introduce --autonumber-start option for initial value of %(autonumber)s
  template (#727, #2702, #9362, #10457, #10529, #11862)

Extractors
+ [azmedien:playlist] Add support for topic and themen playlists (#11817)
* [npo] Fix subtitles extraction
+ [itv] Extract subtitles
+ [itv] Add support for itv.com (#9240)
+ [mtv81] Add support for mtv81.com (#7619)
+ [vlive] Add support for channels (#11826)
+ [kaltura] Add fallback for fileExt
+ [kaltura] Improve uploader_id extraction
+ [konserthusetplay] Add support for rspoplay.se (#11828)


version 2017.01.28

Core
* [utils] Improve parse_duration

Extractors
* [crunchyroll] Improve series and season metadata extraction (#11832)
* [soundcloud] Improve formats extraction and extract audio bitrate
+ [soundcloud] Extract HLS formats
* [soundcloud] Fix track URL extraction (#11852)
+ [twitch:vod] Expand URL regular expressions (#11846)
* [aenetworks] Fix season episodes extraction (#11669)
+ [tva] Add support for videos.tva.ca (#11842)
* [jamendo] Improve and extract more metadata (#11836)
+ [disney] Add support for Disney sites (#7409, #11801, #4975, #11000)
* [vevo] Remove request to old API and catch API v2 errors
+ [cmt,mtv,southpark] Add support for episode URLs (#11837)
+ [youtube] Add fallback for duration extraction (#11841)


version 2017.01.25

Extractors
+ [openload] Fallback video extension to mp4
+ [extractor/generic] Add support for Openload embeds (#11536, #11812)
* [srgssr] Fix rts video extraction (#11831)
+ [afreecatv:global] Add support for afreeca.tv (#11807)
+ [crackle] Extract vtt subtitles
+ [crackle] Extract multiple resolutions for thumbnails
+ [crackle] Add support for mobile URLs
+ [konserthusetplay] Extract subtitles (#11823)
+ [konserthusetplay] Add support for HLS videos (#11823)
* [vimeo:review] Fix config URL extraction (#11821)


version 2017.01.24

Extractors
* [pluralsight] Fix extraction (#11820)
+ [nextmedia] Add support for NextTV (壹電視)
* [24video] Fix extraction (#11811)
* [youtube:playlist] Fix nonexistent and private playlist detection (#11604)
+ [chirbit] Extract uploader (#11809)


version 2017.01.22

Extractors
+ [pornflip] Add support for pornflip.com (#11556, #11795)
* [chaturbate] Fix extraction (#11797, #11802)
+ [azmedien] Add support for AZ Medien sites (#11784, #11785)
+ [nextmedia] Support redirected URLs
+ [vimeo:channel] Extract videos' titles for playlist entries (#11796)
+ [youtube] Extract episode metadata (#9695, #11774)
+ [cspan] Support Ustream embedded videos (#11547)
+ [1tv] Add support for HLS videos (#11786)
* [uol] Fix extraction (#11770)
* [mtv] Relax triforce feed regular expression (#11766)


version 2017.01.18

Extractors
* [bilibili] Fix extraction (#11077)
+ [canalplus] Add fallback for video id (#11764)
* [20min] Fix extraction (#11683, #11751)
* [imdb] Extend URL regular expression (#11744)
+ [naver] Add support for tv.naver.com links (#11743)


version 2017.01.16

Core
* [options] Apply custom config to final composite configuration (#11741)
* [YoutubeDL] Improve protocol auto determining (#11720)

Extractors
* [xiami] Relax URL regular expressions
* [xiami] Improve track metadata extraction (#11699)
+ [limelight] Check hand-made direct HTTP links
+ [limelight] Add support for direct HTTP links at video.llnw.net (#11737)
+ [brightcove] Recognize another player ID pattern (#11688)
+ [niconico] Support login via cookies (#7968)
* [yourupload] Fix extraction (#11601)
+ [beam:live] Add support for beam.pro live streams (#10702, #11596)
* [vevo] Improve geo restriction detection
+ [dramafever] Add support for URLs with language code (#11714)
* [cbc] Improve playlist support (#11704)


version 2017.01.14

Core
+ [common] Add ability to customize akamai manifest host
+ [utils] Add more date formats

Extractors
- [mtv] Eliminate _transform_rtmp_url
* [mtv] Generalize triforce mgid extraction
+ [cmt] Add support for full episodes and video clips (#11623)
+ [mitele] Extract DASH formats
+ [ooyala] Add support for videos with embedToken (#11684)
* [mixcloud] Fix extraction (#11674)
* [openload] Fix extraction (#10408)
* [tv4] Improve extraction (#11698)
* [freesound] Fix and improve extraction (#11602)
+ [nick] Add support for beta.nick.com (#11655)
* [mtv,cc] Use HLS by default with native HLS downloader (#11641)
* [mtv] Fix non-HLS extraction


version 2017.01.10

Extractors
* [youtube] Fix extraction (#11663, #11664)
+ [inc] Add support for inc.com (#11277, #11647)
+ [youtube] Add itag 212 (#11575)
+ [egghead:course] Add support for egghead.io courses


version 2017.01.08

Core
* Fix "invalid escape sequence" errors under Python 3.6 (#11581)
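
Python 3.6 began warning about unrecognized escape sequences such as `\.` in ordinary string literals, which is why regular expressions are usually written as raw strings. The sketch below shows that a raw literal and an explicitly escaped literal give the same pattern. The URLs are made up for the example.

```python
import re

# In a normal string literal '\.' is an unrecognized escape sequence; Python
# keeps the backslash but warns about it on newer versions. The raw literal
# and the explicitly escaped literal produce the same pattern without warning.
raw_pattern = r'^https?://.*\.jpg$'
escaped_pattern = '^https?://.*\\.jpg$'
print(raw_pattern == escaped_pattern)  # True

print(bool(re.match(raw_pattern, 'https://example.com/thumb.jpg')))  # True
print(bool(re.match(raw_pattern, 'https://example.com/thumbXjpg')))  # False
```

The raw form is the easier one to read and review, since the pattern appears exactly as the regex engine sees it.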

Extractors
+ [hitrecord] Add support for hitrecord.org (#10867, #11626)
- [videott] Remove extractor
* [swrmediathek] Improve extraction
- [sharesix] Remove extractor
- [aol:features] Remove extractor
* [sendtonews] Improve info extraction
* [3sat,phoenix] Fix extraction (#11619)
* [comedycentral/mtv] Add support for HLS videos (#11600)
* [discoverygo] Fix JSON data parsing (#11219, #11522)


version 2017.01.05

Extractors
+ [zdf] Fix extraction (#11055, #11063)
* [pornhub:playlist] Improve extraction (#11594)
+ [cctv] Add support for ncpa-classic.com (#11591)
+ [tunein] Add support for embeds (#11579)


version 2017.01.02

Extractors
* [cctv] Improve extraction (#879, #6753, #8541)
+ [nrktv:episodes] Add support for episodes (#11571)
+ [arkena] Add support for video.arkena.com (#11568)


version 2016.12.31

Core
+ Introduce --config-location option for custom configuration files (#6745,
  #10648)

Extractors
+ [twitch] Add support for player.twitch.tv (#11535, #11537)
+ [videa] Add support for videa.hu (#8181, #11133)
* [vk] Fix postlive videos extraction
* [vk] Extract from playerParams (#11555)
- [freevideo] Remove extractor (#11515)
+ [showroomlive] Add support for showroom-live.com (#11458)
* [xhamster] Fix duration extraction (#11549)
* [rtve:live] Fix extraction (#11529)
* [brightcove:legacy] Improve embeds detection (#11523)
+ [twitch] Add support for rechat messages (#11524)
* [acast] Fix audio and timestamp extraction (#11521)


version 2016.12.22

Core
* [extractor/common] Improve detection of video-only formats in m3u8
  manifests (#11507)

Extractors
+ [theplatform] Pass geo verification headers to SMIL request (#10146)
+ [viu] Pass geo verification headers to auth request
* [rtl2] Extract more formats and metadata
* [vbox7] Skip malformed JSON-LD (#11501)
* [uplynk] Force downloading using native HLS downloader (#11496)
+ [laola1] Add support for another extraction scenario (#11460)


version 2016.12.20

Core
* [extractor/common] Improve fragment URL construction for DASH media
* [extractor/common] Fix codec information extraction for mixed audio/video
  DASH media (#11490)

Extractors
* [vbox7] Fix extraction (#11494)
+ [uktvplay] Add support for uktvplay.uktv.co.uk (#11027)
+ [piksel] Add support for player.piksel.com (#11246)
+ [vimeo] Add support for DASH formats
* [vimeo] Fix extraction for HLS formats (#11490)
* [kaltura] Fix wrong widget ID in some cases (#11480)
+ [nrktv:direkte] Add support for live streams (#11488)
* [pbs] Fix extraction for geo restricted videos (#7095)
* [brightcove:new] Skip widevine classic videos
+ [viu] Add support for viu.com (#10607, #11329)


version 2016.12.18

Core
+ [extractor/common] Recognize DASH formats in html5 media entries

Extractors
+ [ccma] Add support for ccma.cat (#11359)
* [laola1tv] Improve extraction
+ [laola1tv] Add support for embed URLs (#11460)
* [nbc] Fix extraction for MSNBC videos (#11466)
* [twitch] Adapt to new videos pages URL schema (#11469)
+ [meipai] Add support for meipai.com (#10718)
* [jwplatform] Improve subtitles and duration extraction
+ [ondemandkorea] Add support for ondemandkorea.com (#10772)
+ [vvvvid] Add support for vvvvid.it (#5915)


version 2016.12.15

Core
229
README.md
@ -29,7 +29,7 @@ Windows users can [download an .exe file](https://yt-dl.org/latest/youtube-dl.ex
|
||||
|
||||
You can also use pip:
|
||||
|
||||
sudo pip install --upgrade youtube-dl
|
||||
sudo -H pip install --upgrade youtube-dl
|
||||
|
||||
This command will update youtube-dl if you have already installed it. See the [pypi page](https://pypi.python.org/pypi/youtube_dl) for more information.
|
||||
|
||||
@ -44,11 +44,7 @@ Or with [MacPorts](https://www.macports.org/):
|
||||
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
|
||||
|
||||
# DESCRIPTION
|
||||
**youtube-dl** is a command-line program to download videos from
|
||||
YouTube.com and a few more sites. It requires the Python interpreter, version
|
||||
2.6, 2.7, or 3.2+, and it is not platform specific. It should work on
|
||||
your Unix box, on Windows or on Mac OS X. It is released to the public domain,
|
||||
which means you can modify it, redistribute it or use it however you like.
|
||||
**youtube-dl** is a command-line program to download videos from YouTube.com and a few more sites. It requires the Python interpreter, version 2.6, 2.7, or 3.2+, and it is not platform specific. It should work on your Unix box, on Windows or on Mac OS X. It is released to the public domain, which means you can modify it, redistribute it or use it however you like.
|
||||
|
||||
youtube-dl [OPTIONS] URL [URL...]
|
||||
|
||||
@ -84,13 +80,14 @@ which means you can modify it, redistribute it or use it however you like.
|
||||
configuration in ~/.config/youtube-
|
||||
dl/config (%APPDATA%/youtube-dl/config.txt
|
||||
on Windows)
|
||||
--config-location PATH Location of the configuration file; either
|
||||
the path to the config or its containing
|
||||
directory.
|
||||
--flat-playlist Do not extract the videos of a playlist,
|
||||
only list them.
|
||||
--mark-watched Mark videos watched (YouTube only)
|
||||
--no-mark-watched Do not mark videos watched (YouTube only)
|
||||
--no-color Do not emit color codes in output
|
||||
--abort-on-unavailable-fragment Abort downloading when some fragment is not
|
||||
available
|
||||
|
||||
## Network Options:
|
||||
--proxy URL Use the specified HTTP/HTTPS/SOCKS proxy.
|
||||
@ -100,16 +97,23 @@ which means you can modify it, redistribute it or use it however you like.
|
||||
string (--proxy "") for direct connection
|
||||
--socket-timeout SECONDS Time to wait before giving up, in seconds
|
||||
--source-address IP Client-side IP address to bind to
|
||||
(experimental)
|
||||
-4, --force-ipv4 Make all connections via IPv4
|
||||
(experimental)
|
||||
-6, --force-ipv6 Make all connections via IPv6
|
||||
(experimental)
|
||||
|
||||
## Geo Restriction:
|
||||
--geo-verification-proxy URL Use this proxy to verify the IP address for
|
||||
some geo-restricted sites. The default
|
||||
proxy specified by --proxy (or none, if the
|
||||
options is not present) is used for the
|
||||
actual downloading. (experimental)
|
||||
actual downloading.
|
||||
--geo-bypass Bypass geographic restriction via faking
|
||||
X-Forwarded-For HTTP header (experimental)
|
||||
--no-geo-bypass Do not bypass geographic restriction via
|
||||
faking X-Forwarded-For HTTP header
|
||||
(experimental)
|
||||
--geo-bypass-country CODE Force bypass geographic restriction with
|
||||
explicitly provided two-letter ISO 3166-2
|
||||
country code (experimental)
|
||||
|
||||
## Video Selection:
|
||||
--playlist-start NUMBER Playlist video to start at (default is 1)
|
||||
@ -140,16 +144,18 @@ which means you can modify it, redistribute it or use it however you like.
|
||||
COUNT views
|
||||
--max-views COUNT Do not download any videos with more than
|
||||
COUNT views
|
||||
--match-filter FILTER Generic video filter (experimental).
|
||||
Specify any key (see help for -o for a list
|
||||
of available keys) to match if the key is
|
||||
present, !key to check if the key is not
|
||||
present,key > NUMBER (like "comment_count >
|
||||
12", also works with >=, <, <=, !=, =) to
|
||||
compare against a number, and & to require
|
||||
multiple matches. Values which are not
|
||||
known are excluded unless you put a
|
||||
question mark (?) after the operator.For
|
||||
--match-filter FILTER Generic video filter. Specify any key (see
|
||||
help for -o for a list of available keys)
|
||||
to match if the key is present, !key to
|
||||
check if the key is not present, key >
|
||||
NUMBER (like "comment_count > 12", also
|
||||
works with >=, <, <=, !=, =) to compare
|
||||
against a number, key = 'LITERAL' (like
|
||||
"uploader = 'Mike Smith'", also works with
|
||||
!=) to match against a string literal and &
|
||||
to require multiple matches. Values which
|
||||
are not known are excluded unless you put a
|
||||
question mark (?) after the operator. For
|
||||
example, to only match videos that have
|
||||
been liked more than 100 times and disliked
|
||||
less than 50 times (or the dislike
|
||||
@ -179,6 +185,8 @@ which means you can modify it, redistribute it or use it however you like.
|
||||
only)
|
||||
--skip-unavailable-fragments Skip unavailable fragments (DASH and
|
||||
hlsnative only)
|
||||
--abort-on-unavailable-fragment Abort downloading when some fragment is not
|
||||
available
|
||||
--buffer-size SIZE Size of download buffer (e.g. 1024 or 16K)
|
||||
(default is 1024)
|
||||
--no-resize-buffer Do not automatically adjust the buffer
|
||||
@ -186,8 +194,9 @@ which means you can modify it, redistribute it or use it however you like.
|
||||
automatically resized from an initial value
|
||||
of SIZE.
|
||||
--playlist-reverse Download playlist videos in reverse order
|
||||
--playlist-random Download playlist videos in random order
|
||||
--xattr-set-filesize Set file xattribute ytdl.filesize with
|
||||
expected filesize (experimental)
|
||||
expected file size (experimental)
|
||||
--hls-prefer-native Use the native HLS downloader instead of
|
||||
ffmpeg
|
||||
--hls-prefer-ffmpeg Use ffmpeg instead of the native HLS
|
||||
@ -208,19 +217,11 @@ which means you can modify it, redistribute it or use it however you like.
|
||||
--id Use only video ID in file name
|
||||
-o, --output TEMPLATE Output filename template, see the "OUTPUT
|
||||
TEMPLATE" for all the info
|
||||
--autonumber-size NUMBER Specify the number of digits in
|
||||
%(autonumber)s when it is present in output
|
||||
filename template or --auto-number option
|
||||
is given
|
||||
--autonumber-start NUMBER Specify the start value for %(autonumber)s
|
||||
(default is 1)
|
||||
--restrict-filenames Restrict filenames to only ASCII
|
||||
characters, and avoid "&" and spaces in
|
||||
filenames
|
||||
-A, --auto-number [deprecated; use -o
|
||||
"%(autonumber)s-%(title)s.%(ext)s" ] Number
|
||||
downloaded files starting from 00000
|
||||
-t, --title [deprecated] Use title in file name
|
||||
(default)
|
||||
-l, --literal [deprecated] Alias of --title
|
||||
-w, --no-overwrites Do not overwrite files
|
||||
-c, --continue Force resume of partially downloaded files.
|
||||
By default, youtube-dl will resume
|
||||
@ -354,7 +355,7 @@ which means you can modify it, redistribute it or use it however you like.
|
||||
-u, --username USERNAME Login with this account ID
|
||||
-p, --password PASSWORD Account password. If this option is left
|
||||
out, youtube-dl will ask interactively.
|
||||
-2, --twofactor TWOFACTOR Two-factor auth code
|
||||
-2, --twofactor TWOFACTOR Two-factor authentication code
|
||||
-n, --netrc Use .netrc authentication data
|
||||
--video-password PASSWORD Video password (vimeo, smotri, youku)
|
||||
|
||||
@ -375,7 +376,7 @@ which means you can modify it, redistribute it or use it however you like.
|
||||
avprobe)
|
||||
--audio-format FORMAT Specify audio format: "best", "aac",
|
||||
"vorbis", "mp3", "m4a", "opus", or "wav";
|
||||
"best" by default
|
||||
"best" by default; No effect without -x
|
||||
--audio-quality QUALITY Specify ffmpeg/avconv audio quality, insert
|
||||
a value between 0 (better) and 9 (worse)
|
||||
for VBR or a specific bitrate like 128K
|
||||
@ -447,6 +448,8 @@ Note that options in configuration file are just the same options aka switches u
|
||||
|
||||
You can use `--ignore-config` if you want to disable the configuration file for a particular youtube-dl run.
|
||||
|
||||
You can also use `--config-location` if you want to use custom configuration file for a particular youtube-dl run.
|
||||
|
||||
### Authentication with `.netrc` file
|
||||
|
||||
You may also want to configure automatic credentials storage for extractors that support authentication (by providing login and password with `--username` and `--password`) in order not to pass credentials as command line arguments on every youtube-dl execution and prevent tracking plain text passwords in the shell command history. You can achieve this using a [`.netrc` file](http://stackoverflow.com/tags/.netrc/info) on a per extractor basis. For that you will need to create a `.netrc` file in your `$HOME` and restrict permissions to read/write by only you:
|
||||
@ -473,87 +476,89 @@ The `-o` option allows users to indicate a template for the output file names.
|
||||
|
||||
**tl;dr:** [navigate me to examples](#output-template-examples).
|
||||
|
||||
The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dl -o funny_video.flv "http://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences have the format `%(NAME)s`. To clarify, that is a percent symbol followed by a name in parentheses, followed by a lowercase S. Allowed names are:
|
||||
The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dl -o funny_video.flv "http://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences may be formatted according to [python string formatting operations](https://docs.python.org/2/library/stdtypes.html#string-formatting). For example, `%(NAME)s` or `%(NAME)05d`. To clarify, that is a percent symbol followed by a name in parentheses, followed by a formatting operations. Allowed names along with sequence type are:
|
||||
|
||||
- `id`: Video identifier
|
||||
- `title`: Video title
|
||||
- `url`: Video URL
|
||||
- `ext`: Video filename extension
|
||||
- `alt_title`: A secondary title of the video
|
||||
- `display_id`: An alternative identifier for the video
|
||||
- `uploader`: Full name of the video uploader
|
||||
- `license`: License name the video is licensed under
|
||||
- `creator`: The creator of the video
|
||||
- `release_date`: The date (YYYYMMDD) when the video was released
|
||||
- `timestamp`: UNIX timestamp of the moment the video became available
|
||||
- `upload_date`: Video upload date (YYYYMMDD)
|
||||
- `uploader_id`: Nickname or id of the video uploader
|
||||
- `location`: Physical location where the video was filmed
|
||||
- `duration`: Length of the video in seconds
|
||||
- `view_count`: How many users have watched the video on the platform
|
||||
- `like_count`: Number of positive ratings of the video
|
||||
- `dislike_count`: Number of negative ratings of the video
|
||||
- `repost_count`: Number of reposts of the video
|
||||
- `average_rating`: Average rating give by users, the scale used depends on the webpage
|
||||
- `comment_count`: Number of comments on the video
|
||||
- `age_limit`: Age restriction for the video (years)
|
||||
- `format`: A human-readable description of the format
|
||||
- `format_id`: Format code specified by `--format`
|
||||
- `format_note`: Additional info about the format
|
||||
- `width`: Width of the video
|
||||
- `height`: Height of the video
|
||||
- `resolution`: Textual description of width and height
|
||||
- `tbr`: Average bitrate of audio and video in KBit/s
|
||||
- `abr`: Average audio bitrate in KBit/s
|
||||
- `acodec`: Name of the audio codec in use
|
||||
- `asr`: Audio sampling rate in Hertz
|
||||
- `vbr`: Average video bitrate in KBit/s
|
||||
- `fps`: Frame rate
|
||||
- `vcodec`: Name of the video codec in use
|
||||
- `container`: Name of the container format
|
||||
- `filesize`: The number of bytes, if known in advance
|
||||
- `filesize_approx`: An estimate for the number of bytes
|
||||
- `protocol`: The protocol that will be used for the actual download
|
||||
- `extractor`: Name of the extractor
|
||||
- `extractor_key`: Key name of the extractor
|
||||
- `epoch`: Unix epoch when creating the file
|
||||
- `autonumber`: Five-digit number that will be increased with each download, starting at zero
|
||||
- `playlist`: Name or id of the playlist that contains the video
|
||||
- `playlist_index`: Index of the video in the playlist padded with leading zeros according to the total length of the playlist
|
||||
- `playlist_id`: Playlist identifier
|
||||
- `playlist_title`: Playlist title
|
||||
- `id` (string): Video identifier
|
||||
- `title` (string): Video title
|
||||
- `url` (string): Video URL
|
||||
- `ext` (string): Video filename extension
|
||||
- `alt_title` (string): A secondary title of the video
|
||||
- `display_id` (string): An alternative identifier for the video
|
||||
- `uploader` (string): Full name of the video uploader
|
||||
- `license` (string): License name the video is licensed under
|
||||
- `creator` (string): The creator of the video
|
||||
- `release_date` (string): The date (YYYYMMDD) when the video was released
|
||||
- `timestamp` (numeric): UNIX timestamp of the moment the video became available
|
||||
- `upload_date` (string): Video upload date (YYYYMMDD)
|
||||
- `uploader_id` (string): Nickname or id of the video uploader
|
||||
- `location` (string): Physical location where the video was filmed
|
||||
- `duration` (numeric): Length of the video in seconds
|
||||
- `view_count` (numeric): How many users have watched the video on the platform
|
||||
- `like_count` (numeric): Number of positive ratings of the video
|
||||
- `dislike_count` (numeric): Number of negative ratings of the video
|
||||
- `repost_count` (numeric): Number of reposts of the video
|
||||
- `average_rating` (numeric): Average rating give by users, the scale used depends on the webpage
|
||||
- `comment_count` (numeric): Number of comments on the video
|
||||
- `age_limit` (numeric): Age restriction for the video (years)
|
||||
- `format` (string): A human-readable description of the format
|
||||
- `format_id` (string): Format code specified by `--format`
|
||||
- `format_note` (string): Additional info about the format
|
||||
- `width` (numeric): Width of the video
|
||||
- `height` (numeric): Height of the video
|
||||
- `resolution` (string): Textual description of width and height
|
||||
- `tbr` (numeric): Average bitrate of audio and video in KBit/s
|
||||
- `abr` (numeric): Average audio bitrate in KBit/s
|
||||
- `acodec` (string): Name of the audio codec in use
|
||||
- `asr` (numeric): Audio sampling rate in Hertz
|
||||
- `vbr` (numeric): Average video bitrate in KBit/s
|
||||
- `fps` (numeric): Frame rate
|
||||
- `vcodec` (string): Name of the video codec in use
|
||||
- `container` (string): Name of the container format
|
||||
- `filesize` (numeric): The number of bytes, if known in advance
|
||||
- `filesize_approx` (numeric): An estimate for the number of bytes
|
||||
- `protocol` (string): The protocol that will be used for the actual download
|
||||
- `extractor` (string): Name of the extractor
|
||||
- `extractor_key` (string): Key name of the extractor
|
||||
- `epoch` (numeric): Unix epoch when creating the file
|
||||
- `autonumber` (numeric): Five-digit number that will be increased with each download, starting at zero
|
||||
- `playlist` (string): Name or id of the playlist that contains the video
|
||||
- `playlist_index` (numeric): Index of the video in the playlist padded with leading zeros according to the total length of the playlist
|
||||
- `playlist_id` (string): Playlist identifier
|
||||
- `playlist_title` (string): Playlist title
|
||||
|
||||
|
||||
Available for the video that belongs to some logical chapter or section:
|
||||
- `chapter`: Name or title of the chapter the video belongs to
|
||||
- `chapter_number`: Number of the chapter the video belongs to
|
||||
- `chapter_id`: Id of the chapter the video belongs to
|
||||
- `chapter` (string): Name or title of the chapter the video belongs to
|
||||
- `chapter_number` (numeric): Number of the chapter the video belongs to
|
||||
- `chapter_id` (string): Id of the chapter the video belongs to
|
||||
|
||||
Available for the video that is an episode of some series or programme:
|
||||
- `series`: Title of the series or programme the video episode belongs to
|
||||
- `season`: Title of the season the video episode belongs to
|
||||
- `season_number`: Number of the season the video episode belongs to
|
||||
- `season_id`: Id of the season the video episode belongs to
|
||||
- `episode`: Title of the video episode
|
||||
- `episode_number`: Number of the video episode within a season
|
||||
- `episode_id`: Id of the video episode
|
||||
- `series` (string): Title of the series or programme the video episode belongs to
|
||||
- `season` (string): Title of the season the video episode belongs to
|
||||
- `season_number` (numeric): Number of the season the video episode belongs to
|
||||
- `season_id` (string): Id of the season the video episode belongs to
|
||||
- `episode` (string): Title of the video episode
|
||||
- `episode_number` (numeric): Number of the video episode within a season
|
||||
- `episode_id` (string): Id of the video episode
|
||||
|
||||
Available for the media that is a track or a part of a music album:
|
||||
- `track`: Title of the track
|
||||
- `track_number`: Number of the track within an album or a disc
|
||||
- `track_id`: Id of the track
|
||||
- `artist`: Artist(s) of the track
|
||||
- `genre`: Genre(s) of the track
|
||||
- `album`: Title of the album the track belongs to
|
||||
- `album_type`: Type of the album
|
||||
- `album_artist`: List of all artists appeared on the album
|
||||
- `disc_number`: Number of the disc or other physical medium the track belongs to
|
||||
- `release_year`: Year (YYYY) when the album was released
|
||||
- `track` (string): Title of the track
|
||||
- `track_number` (numeric): Number of the track within an album or a disc
|
||||
- `track_id` (string): Id of the track
|
||||
- `artist` (string): Artist(s) of the track
|
||||
- `genre` (string): Genre(s) of the track
|
||||
- `album` (string): Title of the album the track belongs to
|
||||
- `album_type` (string): Type of the album
|
||||
- `album_artist` (string): List of all artists appeared on the album
|
||||
- `disc_number` (numeric): Number of the disc or other physical medium the track belongs to
|
||||
- `release_year` (numeric): Year (YYYY) when the album was released
|
||||
|
||||
Each aforementioned sequence when referenced in an output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by a particular extractor. Such sequences will be replaced with `NA`.
|
||||
|
||||
For example for `-o %(title)s-%(id)s.%(ext)s` and an mp4 video with title `youtube-dl test video` and id `BaW_jenozKcj`, this will result in a `youtube-dl test video-BaW_jenozKcj.mp4` file created in the current directory.
|
||||
|
||||
For numeric sequences you can use numeric related formatting, for example, `%(view_count)05d` will result in a string with view count padded with zeros up to 5 characters, like in `00042`.
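
The substitution described above follows Python's `%` string formatting with a mapping on the right-hand side. The following is a minimal sketch of that mechanism only, assuming a hypothetical `info` dictionary and a hypothetical `render` helper; neither is part of youtube-dl, and the real program does additional work (for example, handling missing fields) that is not shown here.

```python
# Hypothetical metadata for illustration only; real values come from an extractor.
info = {
    'title': 'youtube-dl test video',
    'id': 'BaW_jenozKcj',
    'ext': 'mp4',
    'view_count': 42,
}


def render(template, fields):
    """Expand %(NAME)s-style sequences using Python's % operator."""
    return template % fields


# String sequences use the "s" conversion.
print(render('%(title)s-%(id)s.%(ext)s', info))
# -> youtube-dl test video-BaW_jenozKcj.mp4

# Numeric sequences accept numeric conversions such as zero padding.
print(render('%(view_count)05d', info))
# -> 00042

# A doubled percent sign produces a literal percent character.
print(render('100%% %(title)s.%(ext)s', info))
```

Because the sequence type determines which conversions are valid, a numeric conversion such as `05d` is only meaningful for fields listed as numeric.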
|
||||
|
||||
Output templates can also contain arbitrary hierarchical path, e.g. `-o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s'` which will result in downloading each video in a directory corresponding to this path template. Any missing directory will be automatically created for you.
|
||||
|
||||
To use percent literals in an output template use `%%`. To output to stdout use `-o -`.
|
||||
@ -638,7 +643,7 @@ Also filtering work for comparisons `=` (equals), `!=` (not equals), `^=` (begin
|
||||
- `acodec`: Name of the audio codec in use
|
||||
- `vcodec`: Name of the video codec in use
|
||||
- `container`: Name of the container format
|
||||
- `protocol`: The protocol that will be used for the actual download, lower-case. `http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `m3u8`, or `m3u8_native`
|
||||
- `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `m3u8`, or `m3u8_native`)
|
||||
- `format_id`: A short description of the format
|
||||
|
||||
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the video hoster.
|
||||
@ -744,7 +749,7 @@ Most people asking this question are not aware that youtube-dl now defaults to d
|
||||
|
||||
### I get HTTP error 402 when trying to download a video. What's this?
|
||||
|
||||
Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering to provide a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is pointing a webbrowser to the youtube URL, solving the CAPTCHA, and restart youtube-dl.
|
||||
Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering to provide a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is pointing a web browser to the youtube URL, solving the CAPTCHA, and restart youtube-dl.
|
||||
|
||||
### Do I need any other programs?
|
||||
|
||||
@ -756,7 +761,7 @@ Videos or video formats streamed via RTMP protocol can only be downloaded when [
|
||||
|
||||
Once the video is fully downloaded, use any video player, such as [mpv](https://mpv.io/), [vlc](http://www.videolan.org/) or [mplayer](http://www.mplayerhq.hu/).
|
||||
|
||||
### I extracted a video URL with `-g`, but it does not play on another machine / in my webbrowser.
|
||||
### I extracted a video URL with `-g`, but it does not play on another machine / in my web browser.
|
||||
|
||||
It depends a lot on the service. In many cases, requests for the video (to download/play it) must come from the same IP address and with the same cookies and/or HTTP headers. Use the `--cookies` option to write the required cookies into a file, and advise your downloader to read cookies from that file. Some sites also require a common user agent to be used, use `--dump-user-agent` to see the one in use by youtube-dl. You can also get necessary cookies and HTTP headers from JSON output obtained with `--dump-json`.
|
||||
|
||||
@ -840,7 +845,7 @@ Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
|
||||
|
||||
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [Export Cookies](https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) (for Firefox).
|
||||
|
||||
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows, `LF` (`\n`) for Linux and `CR` (`\r`) for Mac OS. `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
|
||||
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, Mac OS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
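
The two checks mentioned above (the required first line and a consistent newline format) can be sketched in Python. This is an illustration under stated assumptions, not code from youtube-dl: the helper names `normalize_newlines` and `has_valid_header` and the sample cookie line are hypothetical.

```python
import os


def normalize_newlines(data, newline=None):
    """Return bytes with every CRLF, CR or LF line ending replaced by `newline`.

    When `newline` is omitted, the convention of the current OS is used.
    """
    if newline is None:
        newline = os.linesep.encode('ascii')
    # Collapse every convention to LF first, then expand to the target one.
    unified = data.replace(b'\r\n', b'\n').replace(b'\r', b'\n')
    return unified.replace(b'\n', newline)


def has_valid_header(data):
    """Check that the first line is one of the two accepted header lines."""
    first_line = data.splitlines()[0].strip() if data else b''
    return first_line in (b'# HTTP Cookie File', b'# Netscape HTTP Cookie File')


# Hypothetical file content exported on Windows (CRLF line endings).
sample = b'# Netscape HTTP Cookie File\r\n.example.com\tTRUE\t/\tFALSE\t0\tname\tvalue\r\n'
converted = normalize_newlines(sample, b'\n')
print(has_valid_header(converted), b'\r' in converted)
```

Working on bytes rather than text avoids any implicit newline translation by the file layer, so the result is exactly what gets written back to disk.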
|
||||
|
||||
Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare).
|
||||
|
||||
@ -932,7 +937,7 @@ If you want to create a build of youtube-dl yourself, you'll need
|
||||
|
||||
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
|
||||
|
||||
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
|
||||
After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):
|
||||
|
||||
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
|
||||
2. Check out the source code with:
|
||||
@ -962,7 +967,7 @@ After you have ensured this site is distributing it's content legally, you can f
|
||||
'id': '42',
|
||||
'ext': 'mp4',
|
||||
'title': 'Video title goes here',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
# TODO more properties, either as:
|
||||
# * A value
|
||||
# * MD5 checksum; start the string with md5:
|
||||
@ -1037,7 +1042,7 @@ Assume at this point `meta`'s layout is:
|
||||
}
|
||||
```
|
||||
|
||||
Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional metafield you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
|
||||
Assume you want to extract `summary` and put it into the resulting info dict as `description`. Since `description` is an optional meta field you should be ready that this key may be missing from the `meta` dict, so that you should extract it like:
|
||||
|
||||
```python
|
||||
description = meta.get('summary') # correct
|
||||
@ -1149,7 +1154,7 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
|
||||
ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
|
||||
```
|
||||
|
||||
Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L128-L278). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
|
||||
Most likely, you'll want to use various options. For a list of options available, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L129-L279). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
|
||||
|
||||
Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
|
||||
|
||||
@ -1252,7 +1257,7 @@ We are then presented with a very complicated request when the original problem
|
||||
|
||||
Some of our users seem to think there is a limit of issues they can or should open. There is no limit of issues they can or should open. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.
|
||||
|
||||
In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, Whitehouse podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
|
||||
In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, White house podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
|
||||
|
||||
### Is anyone going to need the feature?
|
||||
|
||||
|
@ -424,8 +424,6 @@ class BuildHTTPRequestHandler(compat_http_server.BaseHTTPRequestHandler):
|
||||
self.send_header('Content-Length', len(msg))
|
||||
self.end_headers()
|
||||
self.wfile.write(msg)
|
||||
except HTTPError as e:
|
||||
self.send_response(e.code, str(e))
|
||||
else:
|
||||
self.send_response(500, 'Unknown build method "%s"' % action)
|
||||
else:
|
||||
|
@ -1,6 +1,7 @@
|
||||
from __future__ import unicode_literals, print_function
|
||||
|
||||
from inspect import getsource
|
||||
import io
|
||||
import os
|
||||
from os.path import dirname as dirn
|
||||
import sys
|
||||
@ -95,5 +96,5 @@ module_contents.append(
|
||||
|
||||
module_src = '\n'.join(module_contents) + '\n'
|
||||
|
||||
with open(lazy_extractors_filename, 'wt') as f:
|
||||
with io.open(lazy_extractors_filename, 'wt', encoding='utf-8') as f:
|
||||
f.write(module_src)
|
||||
|
21
devscripts/run_tests.sh
Executable file
@ -0,0 +1,21 @@
#!/bin/bash

DOWNLOAD_TESTS="age_restriction|download|subtitles|write_annotations|iqiyi_sdk_interpreter|youtube_lists"

test_set=""
multiprocess_args=""

case "$YTDL_TEST_SET" in
    core)
        test_set="-I test_($DOWNLOAD_TESTS)\.py"
    ;;
    download)
        test_set="-I test_(?!$DOWNLOAD_TESTS).+\.py"
        multiprocess_args="--processes=4 --process-timeout=540"
    ;;
    *)
        break
    ;;
esac

nosetests test --verbose $test_set $multiprocess_args
|
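
The two `-I` patterns in `devscripts/run_tests.sh` are ignore patterns, so each branch names the files to skip rather than the files to run. The sketch below reproduces the patterns with Python's `re` module to show how they split a set of test files. It assumes the test runner applies the pattern to the file name with search semantics; the `selected` helper and the sample file list are hypothetical and only serve the illustration.

```python
import re

DOWNLOAD_TESTS = ('age_restriction|download|subtitles|write_annotations'
                  '|iqiyi_sdk_interpreter|youtube_lists')

# Patterns copied from the two branches of the case statement.
IGNORE_FOR_CORE = re.compile(r'test_(%s)\.py' % DOWNLOAD_TESTS)
IGNORE_FOR_DOWNLOAD = re.compile(r'test_(?!%s).+\.py' % DOWNLOAD_TESTS)


def selected(filenames, ignore_pattern):
    """Keep the files that the ignore pattern does not match."""
    return [name for name in filenames if not ignore_pattern.search(name)]


# Hypothetical sample of test file names.
files = ['test_utils.py', 'test_download.py', 'test_subtitles.py', 'test_YoutubeDL.py']

print(selected(files, IGNORE_FOR_CORE))
# -> ['test_utils.py', 'test_YoutubeDL.py']
print(selected(files, IGNORE_FOR_DOWNLOAD))
# -> ['test_download.py', 'test_subtitles.py']
```

The `download` branch relies on a negative lookahead: it ignores every test file whose name does not start with one of the listed download test names, which leaves only the download tests to run.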
@ -11,6 +11,7 @@
|
||||
- **4tube**
|
||||
- **56.com**
|
||||
- **5min**
|
||||
- **6play**
|
||||
- **8tracks**
|
||||
- **91porn**
|
||||
- **9c9media**
|
||||
@ -33,7 +34,8 @@
|
||||
- **AdobeTVVideo**
|
||||
- **AdultSwim**
|
||||
- **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network
|
||||
- **AfreecaTV**: afreecatv.com
|
||||
- **afreecatv**: afreecatv.com
|
||||
- **afreecatv:global**: afreecatv.com
|
||||
- **AirMozilla**
|
||||
- **AlJazeera**
|
||||
- **Allocine**
|
||||
@ -74,6 +76,9 @@
|
||||
- **awaan:live**
|
||||
- **awaan:season**
|
||||
- **awaan:video**
|
||||
- **AZMedien**: AZ Medien videos
|
||||
- **AZMedienPlaylist**: AZ Medien playlists
|
||||
- **AZMedienShowPlaylist**: AZ Medien show playlists
|
||||
- **Azubu**
|
||||
- **AzubuLive**
|
||||
- **BaiduVideo**: 百度视频
|
||||
@ -81,11 +86,13 @@
|
||||
- **bambuser:channel**
|
||||
- **Bandcamp**
|
||||
- **Bandcamp:album**
|
||||
- **bangumi.bilibili.com**: BiliBili番剧
|
||||
- **bbc**: BBC
|
||||
- **bbc.co.uk**: BBC iPlayer
|
||||
- **bbc.co.uk:article**: BBC articles
|
||||
- **bbc.co.uk:iplayer:playlist**
|
||||
- **bbc.co.uk:playlist**
|
||||
- **Beam:live**
|
||||
- **Beatport**
|
||||
- **Beeg**
|
||||
- **BehindKink**
|
||||
@ -131,7 +138,8 @@
|
||||
- **cbsnews**: CBS News
|
||||
- **cbsnews:livevideo**: CBS News Live Videos
|
||||
- **CBSSports**
|
||||
- **CCTV**
|
||||
- **CCMA**
|
||||
- **CCTV**: 央视网
|
||||
- **CDA**
|
||||
- **CeskaTelevize**
|
||||
- **channel9**: Channel 9
|
||||
@ -162,6 +170,7 @@
|
||||
- **ComedyCentralShortname**
|
||||
- **ComedyCentralTV**
|
||||
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
|
||||
- **Corus**
|
||||
- **Coub**
|
||||
- **Cracked**
|
||||
- **Crackle**
|
||||
@ -183,6 +192,8 @@
|
||||
- **dailymotion:playlist**
|
||||
- **dailymotion:user**
|
||||
- **DailymotionCloud**
|
||||
- **Daisuki**
|
||||
- **DaisukiPlaylist**
|
||||
- **daum.net**
|
||||
- **daum.net:clip**
|
||||
- **daum.net:playlist**
|
||||
@ -197,6 +208,7 @@
|
||||
- **Digiteka**
|
||||
- **Discovery**
|
||||
- **DiscoveryGo**
|
||||
- **Disney**
|
||||
- **Dotsub**
|
||||
- **DouyuTV**: 斗鱼
|
||||
- **DPlay**
|
||||
@ -205,7 +217,8 @@
|
||||
- **DRBonanza**
|
||||
- **Dropbox**
|
||||
- **DrTuber**
|
||||
- **DRTV**
|
||||
- **drtv**
|
||||
- **drtv:live**
|
||||
- **Dumpert**
|
||||
- **dvtv**: http://video.aktualne.cz/
|
||||
- **dw**
|
||||
@ -213,6 +226,7 @@
|
||||
- **EaglePlatform**
|
||||
- **EbaumsWorld**
|
||||
- **EchoMsk**
|
||||
- **egghead:course**: egghead.io course
|
||||
- **eHow**
|
||||
- **Einthusan**
|
||||
- **eitb.tv**
|
||||
@ -228,6 +242,7 @@
|
||||
- **ESPN**
|
||||
- **ESPNArticle**
|
||||
- **EsriVideo**
|
||||
- **ETOnline**
|
||||
- **Europa**
|
||||
- **EveryonesMixtape**
|
||||
- **ExpoTV**
|
||||
@ -239,8 +254,9 @@
|
||||
- **fc2**
|
||||
- **fc2:embed**
|
||||
- **Fczenit**
|
||||
- **features.aol.com**
|
||||
- **fernsehkritik.tv**
|
||||
- **filmon**
|
||||
- **filmon:channel**
|
||||
- **Firstpost**
|
||||
- **FiveTV**
|
||||
- **Flickr**
|
||||
@ -262,7 +278,7 @@
|
||||
- **francetvinfo.fr**
|
||||
- **Freesound**
|
||||
- **freespeech.org**
|
||||
- **FreeVideo**
|
||||
- **FreshLive**
|
||||
- **Funimation**
|
||||
- **FunnyOrDie**
|
||||
- **Fusion**
|
||||
@ -273,6 +289,7 @@
|
||||
- **Gamersyde**
|
||||
- **GameSpot**
|
||||
- **GameStar**
|
||||
- **Gaskrank**
|
||||
- **Gazeta**
|
||||
- **GDCVault**
|
||||
- **generic**: Generic downloader that works on some sites
|
||||
@ -298,12 +315,13 @@
|
||||
- **HellPorno**
|
||||
- **Helsinki**: helsinki.fi
|
||||
- **HentaiStigma**
|
||||
- **HGTV**
|
||||
- **hetklokhuis**
|
||||
- **hgtv.com:show**
|
||||
- **HistoricFilms**
|
||||
- **history:topic**: History.com Topic
|
||||
- **hitbox**
|
||||
- **hitbox:live**
|
||||
- **HitRecord**
|
||||
- **HornBunny**
|
||||
- **HotNewHipHop**
|
||||
- **HotStar**
|
||||
@ -321,6 +339,7 @@
|
||||
- **Imgur**
|
||||
- **ImgurAlbum**
|
||||
- **Ina**
|
||||
- **Inc**
|
||||
- **Indavideo**
|
||||
- **IndavideoEmbed**
|
||||
- **InfoQ**
|
||||
@ -330,6 +349,7 @@
|
||||
- **IPrima**
|
||||
- **iqiyi**: 爱奇艺
|
||||
- **Ir90Tv**
|
||||
- **ITV**
|
||||
- **ivi**: ivi.ru
|
||||
- **ivi:compilation**: ivi.ru compilations
|
||||
- **ivideon**: Ivideon TV
|
||||
@ -364,7 +384,8 @@
|
||||
- **kuwo:singer**: 酷我音乐 - 歌手
|
||||
- **kuwo:song**: 酷我音乐
|
||||
- **la7.it**
|
||||
- **Laola1Tv**
|
||||
- **laola1tv**
|
||||
- **laola1tv:embed**
|
||||
- **LCI**
|
||||
- **Lcp**
|
||||
- **LcpPlay**
|
||||
@ -402,6 +423,7 @@
|
||||
- **MatchTV**
|
||||
- **MDR**: MDR.DE and KiKA
|
||||
- **media.ccc.de**
|
||||
- **Meipai**: 美拍
|
||||
- **MelonVOD**
|
||||
- **META**
|
||||
- **metacafe**
|
||||
@ -436,6 +458,7 @@
|
||||
- **mtg**: MTG services
|
||||
- **mtv**
|
||||
- **mtv.de**
|
||||
- **mtv81**
|
||||
- **mtv:video**
|
||||
- **mtvservices:embedded**
|
||||
- **MuenchenTV**: münchen.tv
|
||||
@ -478,6 +501,7 @@
|
||||
- **Newstube**
|
||||
- **NextMedia**: 蘋果日報
|
||||
- **NextMediaActionNews**: 蘋果日報 - 動新聞
|
||||
- **NextTV**: 壹電視
|
||||
- **nfb**: National Film Board of Canada
|
||||
- **nfl.com**
|
||||
- **NhkVod**
|
||||
@ -493,6 +517,7 @@
|
||||
- **Nintendo**
|
||||
- **njoy**: N-JOY
|
||||
- **njoy:embed**
|
||||
- **NJPWWorld**: 新日本プロレスワールド
|
||||
- **NobelPrize**
|
||||
- **Noco**
|
||||
- **Normalboots**
|
||||
@ -514,6 +539,9 @@
|
||||
- **NRKPlaylist**
|
||||
- **NRKSkole**: NRK Skole
|
||||
- **NRKTV**: NRK TV and NRK Radio
|
||||
- **NRKTVDirekte**: NRK TV Direkte and NRK Radio Direkte
|
||||
- **NRKTVEpisodes**
|
||||
- **NRKTVSeries**
|
||||
- **ntv.ru**
|
||||
- **Nuvid**
|
||||
- **NYTimes**
|
||||
@ -524,8 +552,11 @@
|
||||
- **Odnoklassniki**
|
||||
- **OktoberfestTV**
|
||||
- **on.aol.com**
|
||||
- **OnDemandKorea**
|
||||
- **onet.pl**
|
||||
- **onet.tv**
|
||||
- **onet.tv:channel**
|
||||
- **OnetMVP**
|
||||
- **OnionStudios**
|
||||
- **Ooyala**
|
||||
- **OoyalaExternal**
|
||||
@ -547,6 +578,7 @@
|
||||
- **PhilharmonieDeParis**: Philharmonie de Paris
|
||||
- **phoenix.de**
|
||||
- **Photobucket**
|
||||
- **Piksel**
|
||||
- **Pinkbike**
|
||||
- **Pladform**
|
||||
- **play.fm**
|
||||
@ -563,6 +595,7 @@
|
||||
- **PolskieRadio**
|
||||
- **PolskieRadioCategory**
|
||||
- **PornCom**
|
||||
- **PornFlip**
|
||||
- **PornHd**
|
||||
- **PornHub**: PornHub and Thumbzilla
|
||||
- **PornHubPlaylist**
|
||||
@ -593,6 +626,7 @@
|
||||
- **RaiTV**
|
||||
- **RBMARadio**
|
||||
- **RDS**: RDS.ca
|
||||
- **RedBullTV**
|
||||
- **RedTube**
|
||||
- **RegioTV**
|
||||
- **RENTV**
|
||||
@ -640,11 +674,10 @@
|
||||
- **savefrom.net**
|
||||
- **SBS**: sbs.com.au
|
||||
- **schooltv**
|
||||
- **SciVee**
|
||||
- **screen.yahoo:search**: Yahoo screen search
|
||||
- **Screencast**
|
||||
- **ScreencastOMatic**
|
||||
- **ScreenJunkies**
|
||||
- **scrippsnetworks:watch**
|
||||
- **Seeker**
|
||||
- **SenateISVP**
|
||||
- **SendtoNews**
|
||||
@ -652,9 +685,9 @@
|
||||
- **Sexu**
|
||||
- **Shahid**
|
||||
- **Shared**: shared.sx
|
||||
- **ShareSix**
|
||||
- **ShowRoomLive**
|
||||
- **Sina**
|
||||
- **SixPlay**
|
||||
- **SkylineWebcams**
|
||||
- **skynewsarabia:article**
|
||||
- **skynewsarabia:video**
|
||||
- **SkySports**
|
||||
@ -686,10 +719,10 @@
|
||||
- **Spiegeltv**
|
||||
- **Spike**
|
||||
- **Sport5**
|
||||
- **SportBox**
|
||||
- **SportBoxEmbed**
|
||||
- **SportDeutschland**
|
||||
- **Sportschau**
|
||||
- **Sprout**
|
||||
- **sr:mediathek**: Saarländischer Rundfunk
|
||||
- **SRGSSR**
|
||||
- **SRGSSRPlay**: srf.ch, rts.ch, rsi.ch, rtr.ch and swissinfo.ch play sites
|
||||
@ -765,6 +798,7 @@
|
||||
- **tunein:program**
|
||||
- **tunein:station**
|
||||
- **tunein:topic**
|
||||
- **TunePk**
|
||||
- **Turbo**
|
||||
- **Tutv**
|
||||
- **tv.dfb.de**
|
||||
@ -772,23 +806,29 @@
|
||||
- **TV2Article**
|
||||
- **TV3**
|
||||
- **TV4**: tv4.se and tv4play.se
|
||||
- **TVA**
|
||||
- **TVANouvelles**
|
||||
- **TVANouvellesArticle**
|
||||
- **TVC**
|
||||
- **TVCArticle**
|
||||
- **tvigle**: Интернет-телевидение Tvigle.ru
|
||||
- **tvland.com**
|
||||
- **TVN24**
|
||||
- **TVNoe**
|
||||
- **tvp**: Telewizja Polska
|
||||
- **tvp:embed**: Telewizja Polska
|
||||
- **tvp:series**
|
||||
- **TVPlayer**
|
||||
- **Tweakers**
|
||||
- **twitch:chapter**
|
||||
- **twitch:clips**
|
||||
- **twitch:past_broadcasts**
|
||||
- **twitch:profile**
|
||||
- **twitch:stream**
|
||||
- **twitch:video**
|
||||
- **twitch:videos:all**
|
||||
- **twitch:videos:highlights**
|
||||
- **twitch:videos:past-broadcasts**
|
||||
- **twitch:videos:uploads**
|
||||
- **twitch:vod**
|
||||
- **twitter**
|
||||
- **twitter:amplify**
|
||||
@ -796,6 +836,7 @@
|
||||
- **udemy**
|
||||
- **udemy:course**
|
||||
- **UDNEmbed**: 聯合影音
|
||||
- **UKTVPlay**
|
||||
- **Unistra**
|
||||
- **uol.com.br**
|
||||
- **uplynk**
|
||||
@ -824,6 +865,7 @@
|
||||
- **ViceShow**
|
||||
- **Vidbit**
|
||||
- **Viddler**
|
||||
- **Videa**
|
||||
- **video.google:search**: Google Video search
|
||||
- **video.mit.edu**
|
||||
- **VideoDetective**
|
||||
@ -833,7 +875,7 @@
|
||||
- **videomore:season**
|
||||
- **videomore:video**
|
||||
- **VideoPremium**
|
||||
- **VideoTt**: video.tt - Your True Tube (Currently broken)
|
||||
- **VideoPress**
|
||||
- **videoweed**: VideoWeed
|
||||
- **Vidio**
|
||||
- **vidme**
|
||||
@ -860,20 +902,27 @@
|
||||
- **Vimple**: Vimple - one-click video hosting
|
||||
- **Vine**
|
||||
- **vine:user**
|
||||
- **Viu**
|
||||
- **viu:ott**
|
||||
- **viu:playlist**
|
||||
- **Vivo**: vivo.sx
|
||||
- **vk**: VK
|
||||
- **vk:uservideos**: VK - User's Videos
|
||||
- **vk:wallpost**
|
||||
- **vlive**
|
||||
- **vlive:channel**
|
||||
- **Vodlocker**
|
||||
- **VODPl**
|
||||
- **VODPlatform**
|
||||
- **VoiceRepublic**
|
||||
- **VoxMedia**
|
||||
- **Vporn**
|
||||
- **vpro**: npo.nl and ntr.nl
|
||||
- **Vrak**
|
||||
- **VRT**
|
||||
- **vube**: Vube.com
|
||||
- **VuClip**
|
||||
- **VVVVID**
|
||||
- **VyboryMos**
|
||||
- **Vzaar**
|
||||
- **Walla**
|
||||
|
5
setup.py
@ -107,8 +107,8 @@ setup(
|
||||
url='https://github.com/rg3/youtube-dl',
|
||||
author='Ricardo Garcia',
|
||||
author_email='ytdl@yt-dl.org',
|
||||
maintainer='Philipp Hagemeister',
|
||||
maintainer_email='phihag@phihag.de',
|
||||
maintainer='Sergey M.',
|
||||
maintainer_email='dstftw@gmail.com',
|
||||
packages=[
|
||||
'youtube_dl',
|
||||
'youtube_dl.extractor', 'youtube_dl.downloader',
|
||||
@ -130,6 +130,7 @@ setup(
|
||||
'Programming Language :: Python :: 3.3',
|
||||
'Programming Language :: Python :: 3.4',
|
||||
'Programming Language :: Python :: 3.5',
|
||||
'Programming Language :: Python :: 3.6',
|
||||
],
|
||||
|
||||
cmdclass={'build_lazy_extractors': build_lazy_extractors},
|
||||
|
@ -1,4 +1,5 @@
|
||||
#!/usr/bin/env python
|
||||
# coding: utf-8
|
||||
|
||||
from __future__ import unicode_literals
|
||||
|
||||
@ -525,6 +526,7 @@ class TestYoutubeDL(unittest.TestCase):
|
||||
'id': '1234',
|
||||
'ext': 'mp4',
|
||||
'width': None,
|
||||
'height': 1080,
|
||||
}
|
||||
|
||||
def fname(templ):
|
||||
@ -534,16 +536,29 @@ class TestYoutubeDL(unittest.TestCase):
|
||||
self.assertEqual(fname('%(id)s-%(width)s.%(ext)s'), '1234-NA.mp4')
|
||||
# Replace missing fields with 'NA'
|
||||
self.assertEqual(fname('%(uploader_date)s-%(id)s.%(ext)s'), 'NA-1234.mp4')
|
||||
self.assertEqual(fname('%(height)d.%(ext)s'), '1080.mp4')
|
||||
self.assertEqual(fname('%(height)6d.%(ext)s'), '  1080.mp4')
|
||||
self.assertEqual(fname('%(height)-6d.%(ext)s'), '1080  .mp4')
|
||||
self.assertEqual(fname('%(height)06d.%(ext)s'), '001080.mp4')
|
||||
self.assertEqual(fname('%(height) 06d.%(ext)s'), ' 01080.mp4')
|
||||
self.assertEqual(fname('%(height)   06d.%(ext)s'), ' 01080.mp4')
|
||||
self.assertEqual(fname('%(height)0 6d.%(ext)s'), ' 01080.mp4')
|
||||
self.assertEqual(fname('%(height)0   6d.%(ext)s'), ' 01080.mp4')
|
||||
self.assertEqual(fname('%(height) 0 6d.%(ext)s'), ' 01080.mp4')
|
||||
self.assertEqual(fname('%%(height)06d.%(ext)s'), '%(height)06d.mp4')
|
||||
self.assertEqual(fname('%(width)06d.%(ext)s'), 'NA.mp4')
|
||||
self.assertEqual(fname('%(width)06d.%%(ext)s'), 'NA.%(ext)s')
|
||||
self.assertEqual(fname('%%(width)06d.%(ext)s'), '%(width)06d.mp4')
|
||||
|
||||
def test_format_note(self):
|
||||
ydl = YoutubeDL()
|
||||
self.assertEqual(ydl._format_note({}), '')
|
||||
assertRegexpMatches(self, ydl._format_note({
|
||||
'vbr': 10,
|
||||
}), '^\s*10k$')
|
||||
}), r'^\s*10k$')
|
||||
assertRegexpMatches(self, ydl._format_note({
|
||||
'fps': 30,
|
||||
}), '^30fps$')
|
||||
}), r'^30fps$')
|
||||
|
||||
def test_postprocessors(self):
|
||||
filename = 'post-processor-testfile.mp4'
|
||||
@ -606,6 +621,8 @@ class TestYoutubeDL(unittest.TestCase):
|
||||
'duration': 30,
|
||||
'filesize': 10 * 1024,
|
||||
'playlist_id': '42',
|
||||
'uploader': "變態妍字幕版 太妍 тест",
|
||||
'creator': "тест ' 123 ' тест--",
|
||||
}
|
||||
second = {
|
||||
'id': '2',
|
||||
@ -616,6 +633,7 @@ class TestYoutubeDL(unittest.TestCase):
|
||||
'description': 'foo',
|
||||
'filesize': 5 * 1024,
|
||||
'playlist_id': '43',
|
||||
'uploader': "тест 123",
|
||||
}
|
||||
videos = [first, second]
|
||||
|
||||
@ -656,6 +674,26 @@ class TestYoutubeDL(unittest.TestCase):
|
||||
res = get_videos(f)
|
||||
self.assertEqual(res, ['1'])
|
||||
|
||||
f = match_filter_func('uploader = "變態妍字幕版 太妍 тест"')
|
||||
res = get_videos(f)
|
||||
self.assertEqual(res, ['1'])
|
||||
|
||||
f = match_filter_func('uploader != "變態妍字幕版 太妍 тест"')
|
||||
res = get_videos(f)
|
||||
self.assertEqual(res, ['2'])
|
||||
|
||||
f = match_filter_func('creator = "тест \' 123 \' тест--"')
|
||||
res = get_videos(f)
|
||||
self.assertEqual(res, ['1'])
|
||||
|
||||
f = match_filter_func("creator = 'тест \\' 123 \\' тест--'")
|
||||
res = get_videos(f)
|
||||
self.assertEqual(res, ['1'])
|
||||
|
||||
f = match_filter_func(r"creator = 'тест \' 123 \' тест--' & duration > 30")
|
||||
res = get_videos(f)
|
||||
self.assertEqual(res, [])
|
||||
|
||||
def test_playlist_items_selection(self):
|
||||
entries = [{
|
||||
'id': compat_str(i),
|
||||
|
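The output template assertions in this test file rely on two behaviours: missing fields render as `NA`, and a numeric conversion such as `%(width)06d` is downgraded to a string conversion when the field is absent. The sketch below is a simplified approximation of that idea, not the actual `prepare_filename` code. The `render` function, its `numeric_fields` default and the `info` dict are hypothetical, and `width` is simply left out of the dict where the test uses `None`.

```python
import collections
import re

# Hypothetical info dict; 'width' is deliberately missing.
info = {'id': '1234', 'ext': 'mp4', 'height': 1080}


def render(template, fields, numeric_fields=('width', 'height')):
    """Fill a %-style template, substituting 'NA' for missing fields."""
    for name in numeric_fields:
        if name not in fields:
            # Rewrite e.g. %(width)06d to %(width)s so that the string 'NA'
            # can be substituted; (?<!%) leaves escaped %% sequences alone.
            pattern = (r'(?<!%)%\({0}\)[#0\-+ ]*\d*(?:\.\d+)?[hlL]?'
                       r'[diouxXeEfFgGcrs]').format(re.escape(name))
            template = re.sub(pattern, '%({0})s'.format(name), template)
    # defaultdict supplies 'NA' for any key the template asks for.
    return template % collections.defaultdict(lambda: 'NA', fields)
```

For example, `render('%(height)06d.%(ext)s', info)` keeps the zero-padded number, whereas `render('%(width)06d.%(ext)s', info)` falls back to `NA.mp4`.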
@ -8,7 +8,7 @@ import sys
|
||||
import unittest
|
||||
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
|
||||
|
||||
from youtube_dl.aes import aes_decrypt, aes_encrypt, aes_cbc_decrypt, aes_decrypt_text
|
||||
from youtube_dl.aes import aes_decrypt, aes_encrypt, aes_cbc_decrypt, aes_cbc_encrypt, aes_decrypt_text
|
||||
from youtube_dl.utils import bytes_to_intlist, intlist_to_bytes
|
||||
import base64
|
||||
|
||||
@ -34,6 +34,13 @@ class TestAES(unittest.TestCase):
|
||||
decrypted = intlist_to_bytes(aes_cbc_decrypt(data, self.key, self.iv))
|
||||
self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)
|
||||
|
||||
def test_cbc_encrypt(self):
|
||||
data = bytes_to_intlist(self.secret_msg)
|
||||
encrypted = intlist_to_bytes(aes_cbc_encrypt(data, self.key, self.iv))
|
||||
self.assertEqual(
|
||||
encrypted,
|
||||
b"\x97\x92+\xe5\x0b\xc3\x18\x91ky9m&\xb3\xb5@\xe6'\xc2\x96.\xc8u\x88\xab9-[\x9e|\xf1\xcd")
|
||||
|
||||
def test_decrypt_text(self):
|
||||
password = intlist_to_bytes(self.key).decode('utf-8')
|
||||
encrypted = base64.b64encode(
|
||||
|
@ -65,6 +65,10 @@ defs = gettestcases()
|
||||
|
||||
|
||||
class TestDownload(unittest.TestCase):
|
||||
# Parallel testing in nosetests. See
|
||||
# http://nose.readthedocs.org/en/latest/doc_tests/test_multiprocess/multiprocess.html
|
||||
_multiprocess_shared_ = True
|
||||
|
||||
maxDiff = None
|
||||
|
||||
def setUp(self):
|
||||
@ -73,7 +77,7 @@ class TestDownload(unittest.TestCase):
|
||||
# Dynamically generate tests
|
||||
|
||||
|
||||
def generator(test_case):
|
||||
def generator(test_case, tname):
|
||||
|
||||
def test_template(self):
|
||||
ie = youtube_dl.extractor.get_info_extractor(test_case['name'])
|
||||
@ -102,6 +106,7 @@ def generator(test_case):
|
||||
return
|
||||
|
||||
params = get_params(test_case.get('params', {}))
|
||||
params['outtmpl'] = tname + '_' + params['outtmpl']
|
||||
if is_playlist and 'playlist' not in test_case:
|
||||
params.setdefault('extract_flat', 'in_playlist')
|
||||
params.setdefault('skip_download', True)
|
||||
@ -146,7 +151,7 @@ def generator(test_case):
|
||||
raise
|
||||
|
||||
if try_num == RETRIES:
|
||||
report_warning('Failed due to network errors, skipping...')
|
||||
report_warning('%s failed due to network errors, skipping...' % tname)
|
||||
return
|
||||
|
||||
print('Retrying: {0} failed tries\n\n##########\n\n'.format(try_num))
|
||||
@ -221,12 +226,12 @@ def generator(test_case):
|
||||
|
||||
# And add them to TestDownload
|
||||
for n, test_case in enumerate(defs):
|
||||
test_method = generator(test_case)
|
||||
tname = 'test_' + str(test_case['name'])
|
||||
i = 1
|
||||
while hasattr(TestDownload, tname):
|
||||
tname = 'test_%s_%d' % (test_case['name'], i)
|
||||
i += 1
|
||||
test_method = generator(test_case, tname)
|
||||
test_method.__name__ = str(tname)
|
||||
setattr(TestDownload, test_method.__name__, test_method)
|
||||
del test_method
|
||||
|
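The last hunk above computes a unique test name before calling `generator`, so that the name can also be used as a prefix for the output template. The naming loop can be sketched in isolation as follows; the `TestDownload` class here is a bare stand-in and `register` is a hypothetical helper, not a function from the test suite.

```python
class TestDownload(object):
    """Hypothetical stand-in for the unittest.TestCase subclass."""


def register(cls, base_name, func):
    """Attach func to cls under a unique test_<name>[_<n>] attribute."""
    tname = 'test_' + base_name
    i = 1
    # Append an increasing suffix until the name is unused on the class.
    while hasattr(cls, tname):
        tname = 'test_%s_%d' % (base_name, i)
        i += 1
    func.__name__ = str(tname)
    setattr(cls, tname, func)
    return tname


# Registering three cases for the same extractor yields distinct names.
names = [register(TestDownload, 'Youtube', lambda self: None) for _ in range(3)]
```

Because each generated name is unique, prefixing `outtmpl` with it should keep tests that run in parallel processes from writing to the same file.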
@ -34,6 +34,9 @@ from youtube_dl.utils import (
|
||||
find_xpath_attr,
|
||||
fix_xml_ampersands,
|
||||
get_element_by_class,
|
||||
get_element_by_attribute,
|
||||
get_elements_by_class,
|
||||
get_elements_by_attribute,
|
||||
InAdvancePagedList,
|
||||
intlist_to_bytes,
|
||||
is_html,
|
||||
@ -49,6 +52,7 @@ from youtube_dl.utils import (
|
||||
parse_filesize,
|
||||
parse_count,
|
||||
parse_iso8601,
|
||||
pkcs1pad,
|
||||
read_batch_urls,
|
||||
sanitize_filename,
|
||||
sanitize_path,
|
||||
@ -295,6 +299,9 @@ class TestUtil(unittest.TestCase):
|
||||
self.assertEqual(unified_strdate('27.02.2016 17:30'), '20160227')
|
||||
self.assertEqual(unified_strdate('UNKNOWN DATE FORMAT'), None)
|
||||
self.assertEqual(unified_strdate('Feb 7, 2016 at 6:35 pm'), '20160207')
|
||||
self.assertEqual(unified_strdate('July 15th, 2013'), '20130715')
|
||||
self.assertEqual(unified_strdate('September 1st, 2013'), '20130901')
|
||||
self.assertEqual(unified_strdate('Sep 2nd, 2013'), '20130902')
|
||||
|
||||
def test_unified_timestamps(self):
|
||||
self.assertEqual(unified_timestamp('December 21, 2010'), 1292889600)
|
||||
@ -448,16 +455,23 @@ class TestUtil(unittest.TestCase):
|
||||
|
||||
def test_urljoin(self):
|
||||
self.assertEqual(urljoin('http://foo.de/', '/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin(b'http://foo.de/', '/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin('http://foo.de/', b'/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin(b'http://foo.de/', b'/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin('//foo.de/', '/a/b/c.txt'), '//foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin('http://foo.de/', 'a/b/c.txt'), 'http://foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin('http://foo.de', '/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin('http://foo.de', 'a/b/c.txt'), 'http://foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin('http://foo.de/', 'http://foo.de/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin('http://foo.de/', '//foo.de/a/b/c.txt'), '//foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin(None, 'http://foo.de/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin(None, '//foo.de/a/b/c.txt'), '//foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin('', 'http://foo.de/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin(['foobar'], 'http://foo.de/a/b/c.txt'), 'http://foo.de/a/b/c.txt')
|
||||
self.assertEqual(urljoin('http://foo.de/', None), None)
|
||||
self.assertEqual(urljoin('http://foo.de/', ''), None)
|
||||
self.assertEqual(urljoin('http://foo.de/', ['foobar']), None)
|
||||
self.assertEqual(urljoin('http://foo.de/a/b/c.txt', '.././../d.txt'), 'http://foo.de/d.txt')
|
||||
|
||||
def test_parse_age_limit(self):
|
||||
self.assertEqual(parse_age_limit(None), None)
|
||||
@ -503,6 +517,7 @@ class TestUtil(unittest.TestCase):
|
||||
self.assertEqual(parse_duration('1 hour 3 minutes'), 3780)
|
||||
self.assertEqual(parse_duration('87 Min.'), 5220)
|
||||
self.assertEqual(parse_duration('PT1H0.040S'), 3600.04)
|
||||
self.assertEqual(parse_duration('PT00H03M30SZ'), 210)
|
||||
|
||||
def test_fix_xml_ampersands(self):
|
||||
self.assertEqual(
|
||||
@ -777,12 +792,27 @@ class TestUtil(unittest.TestCase):
|
||||
on = js_to_json('["abc", "def",]')
|
||||
self.assertEqual(json.loads(on), ['abc', 'def'])
|
||||
|
||||
on = js_to_json('[/*comment\n*/"abc"/*comment\n*/,/*comment\n*/"def",/*comment\n*/]')
|
||||
self.assertEqual(json.loads(on), ['abc', 'def'])
|
||||
|
||||
on = js_to_json('[//comment\n"abc" //comment\n,//comment\n"def",//comment\n]')
|
||||
self.assertEqual(json.loads(on), ['abc', 'def'])
|
||||
|
||||
on = js_to_json('{"abc": "def",}')
|
||||
self.assertEqual(json.loads(on), {'abc': 'def'})
|
||||
|
||||
on = js_to_json('{/*comment\n*/"abc"/*comment\n*/:/*comment\n*/"def"/*comment\n*/,/*comment\n*/}')
|
||||
self.assertEqual(json.loads(on), {'abc': 'def'})
|
||||
|
||||
on = js_to_json('{ 0: /* " \n */ ",]" , }')
|
||||
self.assertEqual(json.loads(on), {'0': ',]'})
|
||||
|
||||
on = js_to_json('{ /*comment\n*/0/*comment\n*/: /* " \n */ ",]" , }')
|
||||
self.assertEqual(json.loads(on), {'0': ',]'})
|
||||
|
||||
on = js_to_json('{ 0: // comment\n1 }')
|
||||
self.assertEqual(json.loads(on), {'0': 1})
|
||||
|
||||
on = js_to_json(r'["<p>x<\/p>"]')
|
||||
self.assertEqual(json.loads(on), ['<p>x</p>'])
|
||||
|
||||
@ -792,15 +822,27 @@ class TestUtil(unittest.TestCase):
|
||||
on = js_to_json("['a\\\nb']")
|
||||
self.assertEqual(json.loads(on), ['ab'])
|
||||
|
||||
on = js_to_json("/*comment\n*/[/*comment\n*/'a\\\nb'/*comment\n*/]/*comment\n*/")
|
||||
self.assertEqual(json.loads(on), ['ab'])
|
||||
|
||||
on = js_to_json('{0xff:0xff}')
|
||||
self.assertEqual(json.loads(on), {'255': 255})
|
||||
|
||||
on = js_to_json('{/*comment\n*/0xff/*comment\n*/:/*comment\n*/0xff/*comment\n*/}')
|
||||
self.assertEqual(json.loads(on), {'255': 255})
|
||||
|
||||
on = js_to_json('{077:077}')
|
||||
self.assertEqual(json.loads(on), {'63': 63})
|
||||
|
||||
on = js_to_json('{/*comment\n*/077/*comment\n*/:/*comment\n*/077/*comment\n*/}')
|
||||
self.assertEqual(json.loads(on), {'63': 63})
|
||||
|
||||
on = js_to_json('{42:42}')
|
||||
self.assertEqual(json.loads(on), {'42': 42})
|
||||
|
||||
on = js_to_json('{/*comment\n*/42/*comment\n*/:/*comment\n*/42/*comment\n*/}')
|
||||
self.assertEqual(json.loads(on), {'42': 42})
|
||||
|
||||
def test_extract_attributes(self):
|
||||
self.assertEqual(extract_attributes('<e x="y">'), {'x': 'y'})
|
||||
self.assertEqual(extract_attributes("<e x='y'>"), {'x': 'y'})
|
||||
@ -1066,6 +1108,14 @@ The first line
|
||||
ohdave_rsa_encrypt(b'aa111222', e, N),
|
||||
'726664bd9a23fd0c70f9f1b84aab5e3905ce1e45a584e9cbcf9bcc7510338fc1986d6c599ff990d923aa43c51c0d9013cd572e13bc58f4ae48f2ed8c0b0ba881')
|
||||
|
||||
def test_pkcs1pad(self):
|
||||
data = [1, 2, 3]
|
||||
padded_data = pkcs1pad(data, 32)
|
||||
self.assertEqual(padded_data[:2], [0, 2])
|
||||
self.assertEqual(padded_data[28:], [0, 1, 2, 3])
|
||||
|
||||
self.assertRaises(ValueError, pkcs1pad, data, 8)
|
||||
|
||||
def test_encode_base_n(self):
|
||||
self.assertEqual(encode_base_n(0, 30), '0')
|
||||
self.assertEqual(encode_base_n(80, 30), '2k')
|
||||
@ -1089,6 +1139,32 @@ The first line
|
||||
self.assertEqual(get_element_by_class('foo', html), 'nice')
|
||||
self.assertEqual(get_element_by_class('no-such-class', html), None)
|
||||
|
||||
def test_get_element_by_attribute(self):
|
||||
html = '''
|
||||
<span class="foo bar">nice</span>
|
||||
'''
|
||||
|
||||
self.assertEqual(get_element_by_attribute('class', 'foo bar', html), 'nice')
|
||||
self.assertEqual(get_element_by_attribute('class', 'foo', html), None)
|
||||
self.assertEqual(get_element_by_attribute('class', 'no-such-foo', html), None)
|
||||
|
||||
def test_get_elements_by_class(self):
|
||||
html = '''
|
||||
<span class="foo bar">nice</span><span class="foo bar">also nice</span>
|
||||
'''
|
||||
|
||||
self.assertEqual(get_elements_by_class('foo', html), ['nice', 'also nice'])
|
||||
self.assertEqual(get_elements_by_class('no-such-class', html), [])
|
||||
|
||||
def test_get_elements_by_attribute(self):
|
||||
html = '''
|
||||
<span class="foo bar">nice</span><span class="foo bar">also nice</span>
|
||||
'''
|
||||
|
||||
self.assertEqual(get_elements_by_attribute('class', 'foo bar', html), ['nice', 'also nice'])
|
||||
self.assertEqual(get_elements_by_attribute('class', 'foo', html), [])
|
||||
self.assertEqual(get_elements_by_attribute('class', 'no-such-foo', html), [])
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
unittest.main()
|
||||
|
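The `test_pkcs1pad` case above pins down the shape of the padded block: a `[0, 2]` prefix, filler bytes, a `0` separator, then the data, with a `ValueError` when the target length is too short. The sketch below is an independent illustration that satisfies those assertions, not the implementation in `youtube_dl/utils.py`. The name `pkcs1pad_sketch` is hypothetical, and drawing the filler from 1 to 255 follows the PKCS #1 v1.5 requirement that padding bytes be non-zero.

```python
import random


def pkcs1pad_sketch(data, length):
    """Pad a list of byte values to `length` in the PKCS #1 v1.5 type 2 layout."""
    # The layout needs 3 fixed bytes plus at least 8 bytes of padding.
    if len(data) > length - 11:
        raise ValueError('Input data too long for PKCS#1 padding')
    # Non-zero filler, so the single 0 separator marks where the data starts.
    filler = [random.randint(1, 255) for _ in range(length - len(data) - 3)]
    return [0, 2] + filler + [0] + data
```

The filler is random, so only the prefix, the separator and the trailing data are predictable, which is exactly what the test checks.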
@ -24,6 +24,7 @@ import sys
|
||||
import time
|
||||
import tokenize
|
||||
import traceback
|
||||
import random
|
||||
|
||||
from .compat import (
|
||||
compat_basestring,
|
||||
@ -32,6 +33,7 @@ from .compat import (
|
||||
compat_get_terminal_size,
|
||||
compat_http_client,
|
||||
compat_kwargs,
|
||||
compat_numeric_types,
|
||||
compat_os_name,
|
||||
compat_str,
|
||||
compat_tokenize_tokenize,
|
||||
@ -55,6 +57,8 @@ from .utils import (
|
||||
ExtractorError,
|
||||
format_bytes,
|
||||
formatSeconds,
|
||||
GeoRestrictedError,
|
||||
ISO3166Utils,
|
||||
locked_file,
|
||||
make_HTTPS_handler,
|
||||
MaxDownloadsReached,
|
||||
@ -159,6 +163,7 @@ class YoutubeDL(object):
|
||||
playlistend: Playlist item to end at.
|
||||
playlist_items: Specific indices of playlist to download.
|
||||
playlistreverse: Download playlist items in reverse order.
|
||||
playlistrandom: Download playlist items in random order.
|
||||
matchtitle: Download only matching titles.
|
||||
rejecttitle: Reject downloads for matching titles.
|
||||
logger: Log messages to a logging.Logger instance.
|
||||
@ -270,6 +275,12 @@ class YoutubeDL(object):
|
||||
If it returns None, the video is downloaded.
|
||||
match_filter_func in utils.py is one example for this.
|
||||
no_color: Do not emit color codes in output.
|
||||
geo_bypass: Bypass geographic restriction via faking X-Forwarded-For
|
||||
HTTP header (experimental)
|
||||
geo_bypass_country:
|
||||
Two-letter ISO 3166-1 alpha-2 country code that will be used for
|
||||
explicit geographic restriction bypassing via faking
|
||||
X-Forwarded-For HTTP header (experimental)
|
||||
|
||||
The following options determine which downloader is picked:
|
||||
external_downloader: Executable of the external downloader to call.
|
||||
@ -317,11 +328,21 @@ class YoutubeDL(object):
|
||||
self.params.update(params)
|
||||
self.cache = Cache(self)
|
||||
|
||||
if self.params.get('cn_verification_proxy') is not None:
|
||||
self.report_warning('--cn-verification-proxy is deprecated. Use --geo-verification-proxy instead.')
|
||||
def check_deprecated(param, option, suggestion):
|
||||
if self.params.get(param) is not None:
|
||||
self.report_warning(
|
||||
'%s is deprecated. Use %s instead.' % (option, suggestion))
|
||||
return True
|
||||
return False
|
||||
|
||||
if check_deprecated('cn_verification_proxy', '--cn-verification-proxy', '--geo-verification-proxy'):
|
||||
if self.params.get('geo_verification_proxy') is None:
|
||||
self.params['geo_verification_proxy'] = self.params['cn_verification_proxy']
|
||||
|
||||
check_deprecated('autonumber_size', '--autonumber-size', 'output template with %(autonumber)0Nd, where N is the number of digits')
|
||||
check_deprecated('autonumber', '--auto-number', '-o "%(autonumber)s-%(title)s.%(ext)s"')
|
||||
check_deprecated('usetitle', '--title', '-o "%(title)s-%(id)s.%(ext)s"')
|
||||
|
||||
if params.get('bidi_workaround', False):
|
||||
try:
|
||||
import pty
|
||||
@ -583,10 +604,7 @@ class YoutubeDL(object):
|
||||
autonumber_size = self.params.get('autonumber_size')
|
||||
if autonumber_size is None:
|
||||
autonumber_size = 5
|
||||
autonumber_templ = '%0' + str(autonumber_size) + 'd'
|
||||
template_dict['autonumber'] = autonumber_templ % self._num_downloads
|
||||
if template_dict.get('playlist_index') is not None:
|
||||
template_dict['playlist_index'] = '%0*d' % (len(str(template_dict['n_entries'])), template_dict['playlist_index'])
|
||||
template_dict['autonumber'] = self.params.get('autonumber_start', 1) - 1 + self._num_downloads
|
||||
if template_dict.get('resolution') is None:
|
||||
if template_dict.get('width') and template_dict.get('height'):
|
||||
template_dict['resolution'] = '%dx%d' % (template_dict['width'], template_dict['height'])
|
||||
@ -598,13 +616,62 @@ class YoutubeDL(object):
|
||||
sanitize = lambda k, v: sanitize_filename(
|
||||
compat_str(v),
|
||||
restricted=self.params.get('restrictfilenames'),
|
||||
is_id=(k == 'id'))
|
||||
template_dict = dict((k, sanitize(k, v))
|
||||
is_id=(k == 'id' or k.endswith('_id')))
|
||||
template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
|
||||
for k, v in template_dict.items()
|
||||
if v is not None and not isinstance(v, (list, tuple, dict)))
|
||||
template_dict = collections.defaultdict(lambda: 'NA', template_dict)
|
||||
|
||||
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
|
||||
|
||||
# For fields playlist_index and autonumber convert all occurrences
|
||||
# of %(field)s to %(field)0Nd for backward compatibility
|
||||
field_size_compat_map = {
|
||||
'playlist_index': len(str(template_dict['n_entries'])),
|
||||
'autonumber': autonumber_size,
|
||||
}
|
||||
FIELD_SIZE_COMPAT_RE = r'(?<!%)%\((?P<field>autonumber|playlist_index)\)s'
|
||||
mobj = re.search(FIELD_SIZE_COMPAT_RE, outtmpl)
|
||||
if mobj:
|
||||
outtmpl = re.sub(
|
||||
FIELD_SIZE_COMPAT_RE,
|
||||
r'%%(\1)0%dd' % field_size_compat_map[mobj.group('field')],
|
||||
outtmpl)
|
||||
|
||||
NUMERIC_FIELDS = set((
|
||||
'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx',
|
||||
'upload_year', 'upload_month', 'upload_day',
|
||||
'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
|
||||
'average_rating', 'comment_count', 'age_limit',
|
||||
'start_time', 'end_time',
|
||||
'chapter_number', 'season_number', 'episode_number',
|
||||
'track_number', 'disc_number', 'release_year',
|
||||
'playlist_index',
|
||||
))
|
||||
|
||||
# Missing numeric fields used together with integer presentation types
|
||||
# in format specification will break the argument substitution since
|
||||
# string 'NA' is returned for missing fields. We will patch output
|
||||
# template for missing fields to meet string presentation type.
|
||||
for numeric_field in NUMERIC_FIELDS:
|
||||
if numeric_field not in template_dict:
|
||||
# As of [1] format syntax is:
|
||||
# %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
|
||||
# 1. https://docs.python.org/2/library/stdtypes.html#string-formatting
|
||||
FORMAT_RE = r'''(?x)
|
||||
(?<!%)
|
||||
%
|
||||
\({0}\) # mapping key
|
||||
(?:[#0\-+ ]+)? # conversion flags (optional)
|
||||
(?:\d+)? # minimum field width (optional)
|
||||
(?:\.\d+)? # precision (optional)
|
||||
[hlL]? # length modifier (optional)
|
||||
[diouxXeEfFgGcrs%] # conversion type
|
||||
'''
|
||||
outtmpl = re.sub(
|
||||
FORMAT_RE.format(numeric_field),
|
||||
r'%({0})s'.format(numeric_field), outtmpl)
|
||||
|
||||
tmpl = compat_expanduser(outtmpl)
|
||||
filename = tmpl % template_dict
|
||||
# Temporary fix for #4787
|
||||
@ -705,6 +772,14 @@ class YoutubeDL(object):
|
||||
return self.process_ie_result(ie_result, download, extra_info)
|
||||
else:
|
||||
return ie_result
|
||||
except GeoRestrictedError as e:
|
||||
msg = e.msg
|
||||
if e.countries:
|
||||
msg += '\nThis video is available in %s.' % ', '.join(
|
||||
map(ISO3166Utils.short2full, e.countries))
|
||||
msg += '\nYou might want to use a VPN or a proxy server (with --proxy) to workaround.'
|
||||
self.report_error(msg)
|
||||
break
|
||||
except ExtractorError as e: # An error we somewhat expected
|
||||
self.report_error(compat_str(e), e.format_traceback())
|
||||
break
|
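Review note: a minimal sketch of how the geo-restriction message in the hunk above is assembled. `COUNTRY_NAMES` is a hypothetical two-entry stand-in for `ISO3166Utils.short2full`, and `build_geo_message` is a name invented here; the message strings are copied from the diff.

```python
# Hypothetical stand-in for ISO3166Utils.short2full (illustration only).
COUNTRY_NAMES = {'US': 'United States', 'DE': 'Germany'}


def build_geo_message(msg, countries):
    """Append availability and workaround hints to a geo-restriction message."""
    if countries:
        msg += '\nThis video is available in %s.' % ', '.join(
            COUNTRY_NAMES.get(code, code) for code in countries)
    msg += '\nYou might want to use a VPN or a proxy server (with --proxy) to workaround.'
    return msg


print(build_geo_message('Geo restricted', ['US', 'DE']))
```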
||||
@ -842,8 +917,17 @@ class YoutubeDL(object):
|
||||
if self.params.get('playlistreverse', False):
|
||||
entries = entries[::-1]
|
||||
|
||||
if self.params.get('playlistrandom', False):
|
||||
random.shuffle(entries)
|
||||
|
||||
x_forwarded_for = ie_result.get('__x_forwarded_for_ip')
|
||||
|
||||
for i, entry in enumerate(entries, 1):
|
||||
self.to_screen('[download] Downloading video %s of %s' % (i, n_entries))
|
||||
# This __x_forwarded_for_ip thing is a bit ugly but requires
|
||||
# minimal changes
|
||||
if x_forwarded_for:
|
||||
entry['__x_forwarded_for_ip'] = x_forwarded_for
|
||||
extra = {
|
||||
'n_entries': n_entries,
|
||||
'playlist': playlist,
|
||||
@ -1228,6 +1312,11 @@ class YoutubeDL(object):
|
||||
if cookies:
|
||||
res['Cookie'] = cookies
|
||||
|
||||
if 'X-Forwarded-For' not in res:
|
||||
x_forwarded_for_ip = info_dict.get('__x_forwarded_for_ip')
|
||||
if x_forwarded_for_ip:
|
||||
res['X-Forwarded-For'] = x_forwarded_for_ip
|
||||
|
||||
return res
|
||||
|
||||
def _calc_cookies(self, info_dict):
|
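Review note: the header logic added to `_calc_headers` above can be reduced to the sketch below. The function name `add_forwarded_for` is an assumption for illustration; the `__x_forwarded_for_ip` key and the "do not overwrite an existing header" rule come from the diff.

```python
def add_forwarded_for(headers, info_dict):
    """Return a copy of headers with X-Forwarded-For set from info_dict, if absent."""
    res = dict(headers)
    if 'X-Forwarded-For' not in res:
        x_forwarded_for_ip = info_dict.get('__x_forwarded_for_ip')
        if x_forwarded_for_ip:
            res['X-Forwarded-For'] = x_forwarded_for_ip
    return res


# An explicitly supplied header wins over the private info_dict field.
print(add_forwarded_for({}, {'__x_forwarded_for_ip': '203.0.113.7'}))
print(add_forwarded_for({'X-Forwarded-For': '198.51.100.1'},
                        {'__x_forwarded_for_ip': '203.0.113.7'}))
```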
||||
@ -1339,7 +1428,7 @@ class YoutubeDL(object):
|
||||
format['format_id'] = compat_str(i)
|
||||
else:
|
||||
# Sanitize format_id from characters used in format selector expression
|
||||
format['format_id'] = re.sub('[\s,/+\[\]()]', '_', format['format_id'])
|
||||
format['format_id'] = re.sub(r'[\s,/+\[\]()]', '_', format['format_id'])
|
||||
format_id = format['format_id']
|
||||
if format_id not in formats_dict:
|
||||
formats_dict[format_id] = []
|
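Review note: the only change in the hunk above is the `r` prefix on the pattern. The behaviour is the same, but the raw string avoids the invalid-escape warning that newer Python versions emit for `'\s'` in a normal string literal. A quick check, with `sanitize_format_id` as a hypothetical wrapper name:

```python
import re


def sanitize_format_id(format_id):
    """Replace characters used by the format selector syntax with underscores."""
    return re.sub(r'[\s,/+\[\]()]', '_', format_id)


print(sanitize_format_id('hls 720p/audio+video[en]'))  # hls_720p_audio_video_en_
```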
||||
@ -1363,13 +1452,16 @@ class YoutubeDL(object):
|
||||
format['ext'] = determine_ext(format['url']).lower()
|
||||
# Automatically determine protocol if missing (useful for format
|
||||
# selection purposes)
|
||||
if 'protocol' not in format:
|
||||
if format.get('protocol') is None:
|
||||
format['protocol'] = determine_protocol(format)
|
||||
# Add HTTP headers, so that external programs can use them from the
|
||||
# json output
|
||||
full_format_info = info_dict.copy()
|
||||
full_format_info.update(format)
|
||||
format['http_headers'] = self._calc_headers(full_format_info)
|
||||
# Remove private housekeeping stuff
|
||||
if '__x_forwarded_for_ip' in info_dict:
|
||||
del info_dict['__x_forwarded_for_ip']
|
||||
|
||||
# TODO Central sorting goes here
|
||||
|
||||
|
@ -133,6 +133,12 @@ def _real_main(argv=None):
|
||||
parser.error('TV Provider account username missing\n')
|
||||
if opts.outtmpl is not None and (opts.usetitle or opts.autonumber or opts.useid):
|
||||
parser.error('using output template conflicts with using title, video ID or auto number')
|
||||
if opts.autonumber_size is not None:
|
||||
if opts.autonumber_size <= 0:
|
||||
parser.error('auto number size must be positive')
|
||||
if opts.autonumber_start is not None:
|
||||
if opts.autonumber_start < 0:
|
||||
parser.error('auto number start must be positive or 0')
|
||||
if opts.usetitle and opts.useid:
|
||||
parser.error('using title conflicts with using video ID')
|
||||
if opts.username is not None and opts.password is None:
|
||||
@ -236,14 +242,11 @@ def _real_main(argv=None):
|
||||
|
||||
# PostProcessors
|
||||
postprocessors = []
|
||||
# Add the metadata pp first, the other pps will copy it
|
||||
if opts.metafromtitle:
|
||||
postprocessors.append({
|
||||
'key': 'MetadataFromTitle',
|
||||
'titleformat': opts.metafromtitle
|
||||
})
|
||||
if opts.addmetadata:
|
||||
postprocessors.append({'key': 'FFmpegMetadata'})
|
||||
if opts.extractaudio:
|
||||
postprocessors.append({
|
||||
'key': 'FFmpegExtractAudio',
|
||||
@ -273,6 +276,11 @@ def _real_main(argv=None):
|
||||
})
|
||||
if not already_have_thumbnail:
|
||||
opts.writethumbnail = True
|
||||
# FFmpegMetadataPP should be run after FFmpegVideoConvertorPP and
|
||||
# FFmpegExtractAudioPP as containers before conversion may not support
|
||||
# metadata (3gp, webm, etc.)
|
||||
if opts.addmetadata:
|
||||
postprocessors.append({'key': 'FFmpegMetadata'})
|
||||
# XAttrMetadataPP should be run after post-processors that may change file
|
||||
# contents
|
||||
if opts.xattrs:
|
||||
@ -321,6 +329,7 @@ def _real_main(argv=None):
|
||||
'listformats': opts.listformats,
|
||||
'outtmpl': outtmpl,
|
||||
'autonumber_size': opts.autonumber_size,
|
||||
'autonumber_start': opts.autonumber_start,
|
||||
'restrictfilenames': opts.restrictfilenames,
|
||||
'ignoreerrors': opts.ignoreerrors,
|
||||
'force_generic_extractor': opts.force_generic_extractor,
|
||||
@ -337,6 +346,7 @@ def _real_main(argv=None):
|
||||
'playliststart': opts.playliststart,
|
||||
'playlistend': opts.playlistend,
|
||||
'playlistreverse': opts.playlist_reverse,
|
||||
'playlistrandom': opts.playlist_random,
|
||||
'noplaylist': opts.noplaylist,
|
||||
'logtostderr': opts.outtmpl == '-',
|
||||
'consoletitle': opts.consoletitle,
|
||||
@ -405,7 +415,12 @@ def _real_main(argv=None):
|
||||
'postprocessor_args': postprocessor_args,
|
||||
'cn_verification_proxy': opts.cn_verification_proxy,
|
||||
'geo_verification_proxy': opts.geo_verification_proxy,
|
||||
|
||||
'config_location': opts.config_location,
|
||||
'geo_bypass': opts.geo_bypass,
|
||||
'geo_bypass_country': opts.geo_bypass_country,
|
||||
# just for deprecation check
|
||||
'autonumber': opts.autonumber if opts.autonumber is True else None,
|
||||
'usetitle': opts.usetitle if opts.usetitle is True else None,
|
||||
}
|
||||
|
||||
with YoutubeDL(ydl_opts) as ydl:
|
||||
|
@ -60,6 +60,34 @@ def aes_cbc_decrypt(data, key, iv):
|
||||
return decrypted_data
|
||||
|
||||
|
||||
def aes_cbc_encrypt(data, key, iv):
|
||||
"""
|
||||
Encrypt with AES in CBC mode, using PKCS#7 padding
|
||||
|
||||
@param {int[]} data cleartext
|
||||
@param {int[]} key 16/24/32-Byte cipher key
|
||||
@param {int[]} iv 16-Byte IV
|
||||
@returns {int[]} encrypted data
|
||||
"""
|
||||
expanded_key = key_expansion(key)
|
||||
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
|
||||
|
||||
encrypted_data = []
|
||||
previous_cipher_block = iv
|
||||
for i in range(block_count):
|
||||
block = data[i * BLOCK_SIZE_BYTES: (i + 1) * BLOCK_SIZE_BYTES]
|
||||
remaining_length = BLOCK_SIZE_BYTES - len(block)
|
||||
block += [remaining_length] * remaining_length
|
||||
mixed_block = xor(block, previous_cipher_block)
|
||||
|
||||
encrypted_block = aes_encrypt(mixed_block, expanded_key)
|
||||
encrypted_data += encrypted_block
|
||||
|
||||
previous_cipher_block = encrypted_block
|
||||
|
||||
return encrypted_data
|
||||
|
||||
|
||||
def key_expansion(data):
|
||||
"""
|
||||
Generate key schedule
|
||||
|
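Review note: the block splitting and padding in `aes_cbc_encrypt` can be looked at without the cipher. The sketch below reuses the arithmetic from the diff; `split_and_pad` is a name invented here, and `BLOCK_SIZE_BYTES = 16` is assumed to match the module constant.

```python
from math import ceil

BLOCK_SIZE_BYTES = 16  # assumed to match the constant used by the AES module


def split_and_pad(data):
    """Split a list of byte values into blocks, padding the last one as the diff does."""
    block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
    blocks = []
    for i in range(block_count):
        block = data[i * BLOCK_SIZE_BYTES: (i + 1) * BLOCK_SIZE_BYTES]
        remaining_length = BLOCK_SIZE_BYTES - len(block)
        # Each pad byte holds the number of pad bytes (PKCS#7 style).
        blocks.append(block + [remaining_length] * remaining_length)
    return blocks


blocks = split_and_pad(list(range(20)))
print(len(blocks))   # 2
print(blocks[1])     # four data bytes followed by twelve 12s
```

One thing worth noting for reviewers: when the input length is an exact multiple of the block size, this loop adds no padding at all, whereas strict PKCS#7 would append a full block of sixteen `16` bytes. Whether that matters depends on what the receiving side expects.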
@ -2344,7 +2344,7 @@ try:
|
||||
from urllib.parse import unquote_plus as compat_urllib_parse_unquote_plus
|
||||
except ImportError: # Python 2
|
||||
_asciire = (compat_urllib_parse._asciire if hasattr(compat_urllib_parse, '_asciire')
|
||||
else re.compile('([\x00-\x7f]+)'))
|
||||
else re.compile(r'([\x00-\x7f]+)'))
|
||||
|
||||
# HACK: The following are the correct unquote_to_bytes, unquote and unquote_plus
|
||||
# implementations from cpython 3.4.3's stdlib. Python 2's version
|
||||
@ -2529,6 +2529,24 @@ else:
|
||||
el.text = el.text.decode('utf-8')
|
||||
return doc
|
||||
|
||||
if hasattr(etree, 'register_namespace'):
|
||||
compat_etree_register_namespace = etree.register_namespace
|
||||
else:
|
||||
def compat_etree_register_namespace(prefix, uri):
|
||||
"""Register a namespace prefix.
|
||||
The registry is global, and any existing mapping for either the
|
||||
given prefix or the namespace URI will be removed.
|
||||
*prefix* is the namespace prefix, *uri* is a namespace uri. Tags and
|
||||
attributes in this namespace will be serialized with prefix if possible.
|
||||
ValueError is raised if prefix is reserved or is invalid.
|
||||
"""
|
||||
if re.match(r"ns\d+$", prefix):
|
||||
raise ValueError("Prefix format reserved for internal use")
|
||||
for k, v in list(etree._namespace_map.items()):
|
||||
if k == uri or v == prefix:
|
||||
del etree._namespace_map[k]
|
||||
etree._namespace_map[uri] = prefix
|
||||
|
||||
if sys.version_info < (2, 7):
|
||||
# Here comes the crazy part: In 2.6, if the xpath is a unicode,
|
||||
# .//node does not match if a node is a direct child of . !
|
||||
@ -2742,6 +2760,12 @@ else:
|
||||
compat_kwargs = lambda kwargs: kwargs
|
||||
|
||||
|
||||
try:
|
||||
compat_numeric_types = (int, float, long, complex)
|
||||
except NameError: # Python 3
|
||||
compat_numeric_types = (int, float, complex)
|
||||
|
||||
|
||||
if sys.version_info < (2, 7):
|
||||
def compat_socket_create_connection(address, timeout, source_address=None):
|
||||
host, port = address
|
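Review note: `compat_numeric_types` from the hunk above is meant for `isinstance` checks that work on both Python 2 and 3. A small sketch; `is_numeric` is a hypothetical helper, not something the diff adds.

```python
try:
    compat_numeric_types = (int, float, long, complex)  # Python 2 has `long`
except NameError:  # Python 3
    compat_numeric_types = (int, float, complex)


def is_numeric(value):
    """Hypothetical helper: True for int, float, complex (and long on Python 2)."""
    return isinstance(value, compat_numeric_types)


print(is_numeric(3), is_numeric(2.5), is_numeric('3'))  # True True False
```

Note that `bool` is a subclass of `int`, so `is_numeric(True)` is also `True`.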
||||
@ -2865,6 +2889,7 @@ __all__ = [
|
||||
'compat_cookiejar',
|
||||
'compat_cookies',
|
||||
'compat_etree_fromstring',
|
||||
'compat_etree_register_namespace',
|
||||
'compat_expanduser',
|
||||
'compat_get_terminal_size',
|
||||
'compat_getenv',
|
||||
@ -2876,6 +2901,7 @@ __all__ = [
|
||||
'compat_input',
|
||||
'compat_itertools_count',
|
||||
'compat_kwargs',
|
||||
'compat_numeric_types',
|
||||
'compat_ord',
|
||||
'compat_os_name',
|
||||
'compat_parse_qs',
|
||||
|
@ -347,7 +347,10 @@ class FileDownloader(object):
|
||||
if min_sleep_interval:
|
||||
max_sleep_interval = self.params.get('max_sleep_interval', min_sleep_interval)
|
||||
sleep_interval = random.uniform(min_sleep_interval, max_sleep_interval)
|
||||
self.to_screen('[download] Sleeping %s seconds...' % sleep_interval)
|
||||
self.to_screen(
|
||||
'[download] Sleeping %s seconds...' % (
|
||||
int(sleep_interval) if sleep_interval.is_integer()
|
||||
else '%.2f' % sleep_interval))
|
||||
time.sleep(sleep_interval)
|
||||
|
||||
return self.real_download(filename, info_dict)
|
||||
|
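Review note: the new sleep message prints whole numbers without a fractional part and everything else with two decimals. A minimal sketch, assuming the interval is a `float` (as `random.uniform` returns); `format_sleep_message` is a name invented here.

```python
def format_sleep_message(sleep_interval):
    """Format the sleep notice the way the new code does; expects a float."""
    return '[download] Sleeping %s seconds...' % (
        int(sleep_interval) if sleep_interval.is_integer()
        else '%.2f' % sleep_interval)


print(format_sleep_message(5.0))     # [download] Sleeping 5 seconds...
print(format_sleep_message(2.3456))  # [download] Sleeping 2.35 seconds...
```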
@ -43,7 +43,10 @@ class DashSegmentsFD(FragmentFD):
|
||||
count = 0
|
||||
while count <= fragment_retries:
|
||||
try:
|
||||
success = ctx['dl'].download(target_filename, {'url': segment_url})
|
||||
success = ctx['dl'].download(target_filename, {
|
||||
'url': segment_url,
|
||||
'http_headers': info_dict.get('http_headers'),
|
||||
})
|
||||
if not success:
|
||||
return False
|
||||
down, target_sanitized = sanitize_open(target_filename, 'rb')
|
||||
|
@ -6,7 +6,10 @@ import sys
|
||||
import re
|
||||
|
||||
from .common import FileDownloader
|
||||
from ..compat import compat_setenv
|
||||
from ..compat import (
|
||||
compat_setenv,
|
||||
compat_str,
|
||||
)
|
||||
from ..postprocessor.ffmpeg import FFmpegPostProcessor, EXT_TO_OUT_FORMATS
|
||||
from ..utils import (
|
||||
cli_option,
|
||||
@ -17,6 +20,7 @@ from ..utils import (
|
||||
encodeArgument,
|
||||
handle_youtubedl_headers,
|
||||
check_executable,
|
||||
is_outdated_version,
|
||||
)
|
||||
|
||||
|
||||
@ -198,6 +202,15 @@ class FFmpegFD(ExternalFD):
|
||||
|
||||
args = [ffpp.executable, '-y']
|
||||
|
||||
seekable = info_dict.get('_seekable')
|
||||
if seekable is not None:
|
||||
# setting -seekable prevents ffmpeg from guessing if the server
|
||||
# supports seeking (by adding the header `Range: bytes=0-`), which
|
||||
# can cause problems in some cases
|
||||
# https://github.com/rg3/youtube-dl/issues/11800#issuecomment-275037127
|
||||
# http://trac.ffmpeg.org/ticket/6125#comment:10
|
||||
args += ['-seekable', '1' if seekable else '0']
|
||||
|
||||
args += self._configuration_args()
|
||||
|
||||
# start_time = info_dict.get('start_time') or 0
|
||||
@ -260,11 +273,17 @@ class FFmpegFD(ExternalFD):
|
||||
args += ['-rtmp_live', 'live']
|
||||
|
||||
args += ['-i', url, '-c', 'copy']
|
||||
|
||||
if self.params.get('test', False):
|
||||
args += ['-fs', compat_str(self._TEST_FILE_SIZE)]
|
||||
|
||||
if protocol in ('m3u8', 'm3u8_native'):
|
||||
if self.params.get('hls_use_mpegts', False) or tmpfilename == '-':
|
||||
args += ['-f', 'mpegts']
|
||||
else:
|
||||
args += ['-f', 'mp4', '-bsf:a', 'aac_adtstoasc']
|
||||
args += ['-f', 'mp4']
|
||||
if (ffpp.basename == 'ffmpeg' and is_outdated_version(ffpp._versions['ffmpeg'], '3.2', False)) and (not info_dict.get('acodec') or info_dict['acodec'].split('.')[0] in ('aac', 'mp4a')):
|
||||
args += ['-bsf:a', 'aac_adtstoasc']
|
||||
elif protocol == 'rtmp':
|
||||
args += ['-f', 'flv']
|
||||
else:
|
||||
|
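Review note: a simplified sketch of the two argument decisions this file's hunks introduce for HLS-to-MP4 downloads. It is not the real `FFmpegFD` code: the version check is replaced by a plain boolean parameter (`ffmpeg_older_than_3_2`, an assumption standing in for the `is_outdated_version(...)` call and the `ffpp.basename == 'ffmpeg'` test), and the arguments are collected in one list rather than placed around `-i`.

```python
def build_hls_mp4_args(info_dict, ffmpeg_older_than_3_2):
    """Sketch of the -seekable and aac_adtstoasc decisions from the diff."""
    args = []

    seekable = info_dict.get('_seekable')
    if seekable is not None:
        # Only pass -seekable when the extractor stated a preference.
        args += ['-seekable', '1' if seekable else '0']

    args += ['-f', 'mp4']
    acodec = info_dict.get('acodec')
    # The bitstream filter is only added for old ffmpeg and AAC (or unknown) audio.
    if ffmpeg_older_than_3_2 and (
            not acodec or acodec.split('.')[0] in ('aac', 'mp4a')):
        args += ['-bsf:a', 'aac_adtstoasc']
    return args


print(build_hls_mp4_args({'_seekable': False, 'acodec': 'mp4a.40.2'}, True))
print(build_hls_mp4_args({'acodec': 'opus'}, True))
```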
@ -61,6 +61,7 @@ class FragmentFD(FileDownloader):
|
||||
'noprogress': True,
|
||||
'ratelimit': self.params.get('ratelimit'),
|
||||
'retries': self.params.get('retries', 0),
|
||||
'nopart': self.params.get('nopart', False),
|
||||
'test': self.params.get('test', False),
|
||||
}
|
||||
)
|
||||
|
@ -65,6 +65,9 @@ class HlsFD(FragmentFD):
|
||||
s = manifest.decode('utf-8', 'ignore')
|
||||
|
||||
if not self.can_download(s, info_dict):
|
||||
if info_dict.get('extra_param_to_segment_url'):
|
||||
self.report_error('pycrypto not found. Please install it.')
|
||||
return False
|
||||
self.report_warning(
|
||||
'hlsnative has detected features it does not support, '
|
||||
'extraction will be delegated to ffmpeg')
|
||||
|
@ -238,7 +238,10 @@ class IsmFD(FragmentFD):
|
||||
count = 0
|
||||
while count <= fragment_retries:
|
||||
try:
|
||||
success = ctx['dl'].download(target_filename, {'url': segment_url})
|
||||
success = ctx['dl'].download(target_filename, {
|
||||
'url': segment_url,
|
||||
'http_headers': info_dict.get('http_headers'),
|
||||
})
|
||||
if not success:
|
||||
return False
|
||||
down, target_sanitized = sanitize_open(target_filename, 'rb')
|
||||
|
@ -23,7 +23,7 @@ class AbcNewsVideoIE(AMPIE):
|
||||
'title': '\'This Week\' Exclusive: Iran\'s Foreign Minister Zarif',
|
||||
'description': 'George Stephanopoulos goes one-on-one with Iranian Foreign Minister Dr. Javad Zarif.',
|
||||
'duration': 180,
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
},
|
||||
'params': {
|
||||
# m3u8 download
|
||||
@ -59,7 +59,7 @@ class AbcNewsIE(InfoExtractor):
|
||||
'display_id': 'dramatic-video-rare-death-job-america',
|
||||
'title': 'Occupational Hazards',
|
||||
'description': 'Nightline investigates the dangers that lurk at various jobs.',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'upload_date': '20100428',
|
||||
'timestamp': 1272412800,
|
||||
},
|
||||
|
@ -23,7 +23,7 @@ class ABCOTVSIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'East Bay museum celebrates vintage synthesizers',
|
||||
'description': 'md5:a4f10fb2f2a02565c1749d4adbab4b10',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'timestamp': 1421123075,
|
||||
'upload_date': '20150113',
|
||||
'uploader': 'Jonathan Bloom',
|
||||
|
@ -8,6 +8,7 @@ from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
parse_iso8601,
|
||||
OnDemandPagedList,
|
||||
)
|
||||
|
||||
@ -15,18 +16,33 @@ from ..utils import (
|
||||
class ACastIE(InfoExtractor):
|
||||
IE_NAME = 'acast'
|
||||
_VALID_URL = r'https?://(?:www\.)?acast\.com/(?P<channel>[^/]+)/(?P<id>[^/#?]+)'
|
||||
_TEST = {
|
||||
_TESTS = [{
|
||||
# test with one bling
|
||||
'url': 'https://www.acast.com/condenasttraveler/-where-are-you-taipei-101-taiwan',
|
||||
'md5': 'ada3de5a1e3a2a381327d749854788bb',
|
||||
'info_dict': {
|
||||
'id': '57de3baa-4bb0-487e-9418-2692c1277a34',
|
||||
'ext': 'mp3',
|
||||
'title': '"Where Are You?": Taipei 101, Taiwan',
|
||||
'timestamp': 1196172000000,
|
||||
'timestamp': 1196172000,
|
||||
'upload_date': '20071127',
|
||||
'description': 'md5:a0b4ef3634e63866b542e5b1199a1a0e',
|
||||
'duration': 211,
|
||||
}
|
||||
}
|
||||
}, {
|
||||
# test with multiple blings
|
||||
'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna',
|
||||
'md5': '55c0097badd7095f494c99a172f86501',
|
||||
'info_dict': {
|
||||
'id': '2a92b283-1a75-4ad8-8396-499c641de0d9',
|
||||
'ext': 'mp3',
|
||||
'title': '2. Raggarmordet - Röster ur det förflutna',
|
||||
'timestamp': 1477346700,
|
||||
'upload_date': '20161024',
|
||||
'description': 'md5:4f81f6d8cf2e12ee21a321d8bca32db4',
|
||||
'duration': 2797,
|
||||
}
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
channel, display_id = re.match(self._VALID_URL, url).groups()
|
||||
@ -35,11 +51,11 @@ class ACastIE(InfoExtractor):
|
||||
return {
|
||||
'id': compat_str(cast_data['id']),
|
||||
'display_id': display_id,
|
||||
'url': cast_data['blings'][0]['audio'],
|
||||
'url': [b['audio'] for b in cast_data['blings'] if b['type'] == 'BlingAudio'][0],
|
||||
'title': cast_data['name'],
|
||||
'description': cast_data.get('description'),
|
||||
'thumbnail': cast_data.get('image'),
|
||||
'timestamp': int_or_none(cast_data.get('publishingDate')),
|
||||
'timestamp': parse_iso8601(cast_data.get('publishingDate')),
|
||||
'duration': int_or_none(cast_data.get('duration')),
|
||||
}
|
||||
|
||||
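Review note: the ACast change picks the first bling of type `BlingAudio` instead of blindly using `blings[0]`, and parses the publishing date as ISO 8601 rather than as an integer. The sketch below shows both ideas with the standard library only. The sample data and the date string format are assumptions; `next(..., None)` is swapped in for the diff's `[...][0]`, which raises `IndexError` when no audio bling exists; and `datetime.fromisoformat` stands in for youtube-dl's `parse_iso8601`.

```python
from datetime import datetime


def first_audio_url(blings):
    """Return the audio URL of the first BlingAudio entry, or None if there is none."""
    return next(
        (b['audio'] for b in blings if b.get('type') == 'BlingAudio'), None)


def iso8601_to_timestamp(value):
    """Convert an ISO 8601 string with a UTC offset to a Unix timestamp."""
    return int(datetime.fromisoformat(value).timestamp())


# Hypothetical API response fragment (field values invented for illustration).
blings = [
    {'type': 'BlingImage', 'image': 'https://example.com/cover.jpg'},
    {'type': 'BlingAudio', 'audio': 'https://example.com/episode.mp3'},
]
print(first_audio_url(blings))
print(iso8601_to_timestamp('2016-10-24T22:05:00+00:00'))  # 1477346700
```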
|
@ -25,7 +25,8 @@ class AddAnimeIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'description': 'One Piece 606',
|
||||
'title': 'One Piece 606',
|
||||
}
|
||||
},
|
||||
'skip': 'Video is gone',
|
||||
}, {
|
||||
'url': 'http://add-anime.net/video/MDUGWYKNGBD8/One-Piece-687',
|
||||
'only_matching': True,
|
||||
|
@ -31,6 +31,16 @@ MSO_INFO = {
|
||||
'username_field': 'user',
|
||||
'password_field': 'passwd',
|
||||
},
|
||||
'TWC': {
|
||||
'name': 'Time Warner Cable | Spectrum',
|
||||
'username_field': 'Ecom_User_ID',
|
||||
'password_field': 'Ecom_Password',
|
||||
},
|
||||
'Charter_Direct': {
|
||||
'name': 'Charter Spectrum',
|
||||
'username_field': 'IDToken1',
|
||||
'password_field': 'IDToken2',
|
||||
},
|
||||
'thr030': {
|
||||
'name': '3 Rivers Communications'
|
||||
},
|
||||
|
@ -30,7 +30,7 @@ class AdobeTVIE(AdobeTVBaseIE):
|
||||
'ext': 'mp4',
|
||||
'title': 'Quick Tip - How to Draw a Circle Around an Object in Photoshop',
|
||||
'description': 'md5:99ec318dc909d7ba2a1f2b038f7d2311',
|
||||
'thumbnail': 're:https?://.*\.jpg$',
|
||||
'thumbnail': r're:https?://.*\.jpg$',
|
||||
'upload_date': '20110914',
|
||||
'duration': 60,
|
||||
'view_count': int,
|
||||
|
@ -23,7 +23,7 @@ class AENetworksBaseIE(ThePlatformIE):
|
||||
class AENetworksIE(AENetworksBaseIE):
|
||||
IE_NAME = 'aenetworks'
|
||||
IE_DESC = 'A+E Networks: A&E, Lifetime, History.com, FYI Network'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:history|aetv|mylifetime)\.com|fyi\.tv)/(?:shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|movies/(?P<movie_display_id>[^/]+)/full-movie)'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:history|aetv|mylifetime|lifetimemovieclub)\.com|fyi\.tv)/(?:shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|movies/(?P<movie_display_id>[^/]+)(?:/full-movie)?)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
|
||||
'md5': 'a97a65f7e823ae10e9244bc5433d5fe6',
|
||||
@ -62,11 +62,15 @@ class AENetworksIE(AENetworksBaseIE):
|
||||
}, {
|
||||
'url': 'http://www.mylifetime.com/movies/center-stage-on-pointe/full-movie',
|
||||
'only_matching': True
|
||||
}, {
|
||||
'url': 'https://www.lifetimemovieclub.com/movies/a-killer-among-us',
|
||||
'only_matching': True
|
||||
}]
|
||||
_DOMAIN_TO_REQUESTOR_ID = {
|
||||
'history.com': 'HISTORY',
|
||||
'aetv.com': 'AETV',
|
||||
'mylifetime.com': 'LIFETIME',
|
||||
'lifetimemovieclub.com': 'LIFETIMEMOVIECLUB',
|
||||
'fyi.tv': 'FYI',
|
||||
}
|
||||
|
||||
@ -87,7 +91,7 @@ class AENetworksIE(AENetworksBaseIE):
|
||||
self._html_search_meta('aetn:SeriesTitle', webpage))
|
||||
elif url_parts_len == 2:
|
||||
entries = []
|
||||
for episode_item in re.findall(r'(?s)<div[^>]+class="[^"]*episode-item[^"]*"[^>]*>', webpage):
|
||||
for episode_item in re.findall(r'(?s)<[^>]+class="[^"]*(?:episode|program)-item[^"]*"[^>]*>', webpage):
|
||||
episode_attributes = extract_attributes(episode_item)
|
||||
episode_url = compat_urlparse.urljoin(
|
||||
url, episode_attributes['data-canonical'])
|
||||
|
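Review note: the widened `_VALID_URL` can be checked against the test URLs from the diff. The pattern and the domain map below are copied from the new code; `classify` is a hypothetical helper name.

```python
import re

VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:history|aetv|mylifetime|lifetimemovieclub)\.com|fyi\.tv)/(?:shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|movies/(?P<movie_display_id>[^/]+)(?:/full-movie)?)'

DOMAIN_TO_REQUESTOR_ID = {
    'history.com': 'HISTORY',
    'aetv.com': 'AETV',
    'mylifetime.com': 'LIFETIME',
    'lifetimemovieclub.com': 'LIFETIMEMOVIECLUB',
    'fyi.tv': 'FYI',
}


def classify(url):
    """Return (requestor_id, show_path, movie_display_id), or None if no match."""
    mobj = re.match(VALID_URL, url)
    if not mobj:
        return None
    domain, show_path, movie_display_id = mobj.groups()
    return DOMAIN_TO_REQUESTOR_ID[domain], show_path, movie_display_id


print(classify('https://www.lifetimemovieclub.com/movies/a-killer-among-us'))
print(classify('http://www.mylifetime.com/movies/center-stage-on-pointe/full-movie'))
print(classify('http://www.history.com/shows/mountain-men/season-1/episode-1'))
```

The `/full-movie` suffix is now optional, which is what lets the `lifetimemovieclub.com` URL match.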
@ -18,6 +18,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class AfreecaTVIE(InfoExtractor):
|
||||
IE_NAME = 'afreecatv'
|
||||
IE_DESC = 'afreecatv.com'
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
@ -143,3 +144,107 @@ class AfreecaTVIE(InfoExtractor):
|
||||
expected=True)
|
||||
|
||||
return info
|
||||
|
||||
|
||||
class AfreecaTVGlobalIE(AfreecaTVIE):
|
||||
IE_NAME = 'afreecatv:global'
|
||||
_VALID_URL = r'https?://(?:www\.)?afreeca\.tv/(?P<channel_id>\d+)(?:/v/(?P<video_id>\d+))?'
|
||||
_TESTS = [{
|
||||
'url': 'http://afreeca.tv/36853014/v/58301',
|
||||
'info_dict': {
|
||||
'id': '58301',
|
||||
'title': 'tryhard top100',
|
||||
'uploader_id': '36853014',
|
||||
'uploader': 'makgi Hearthstone Live!',
|
||||
},
|
||||
'playlist_count': 3,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
channel_id, video_id = re.match(self._VALID_URL, url).groups()
|
||||
video_type = 'video' if video_id else 'live'
|
||||
query = {
|
||||
'pt': 'view',
|
||||
'bid': channel_id,
|
||||
}
|
||||
if video_id:
|
||||
query['vno'] = video_id
|
||||
video_data = self._download_json(
|
||||
'http://api.afreeca.tv/%s/view_%s.php' % (video_type, video_type),
|
||||
video_id or channel_id, query=query)['channel']
|
||||
|
||||
if video_data.get('result') != 1:
|
||||
raise ExtractorError('%s said: %s' % (self.IE_NAME, video_data['remsg']))
|
||||
|
||||
title = video_data['title']
|
||||
|
||||
info = {
|
||||
'thumbnail': video_data.get('thumb'),
|
||||
'view_count': int_or_none(video_data.get('vcnt')),
|
||||
'age_limit': int_or_none(video_data.get('grade')),
|
||||
'uploader_id': channel_id,
|
||||
'uploader': video_data.get('cname'),
|
||||
}
|
||||
|
||||
if video_id:
|
||||
entries = []
|
||||
for i, f in enumerate(video_data.get('flist', [])):
|
||||
video_key = self.parse_video_key(f.get('key', ''))
|
||||
f_url = f.get('file')
|
||||
if not video_key or not f_url:
|
||||
continue
|
||||
entries.append({
|
||||
'id': '%s_%s' % (video_id, video_key.get('part', i + 1)),
|
||||
'title': title,
|
||||
'upload_date': video_key.get('upload_date'),
|
||||
'duration': int_or_none(f.get('length')),
|
||||
'url': f_url,
|
||||
'protocol': 'm3u8_native',
|
||||
'ext': 'mp4',
|
||||
})
|
||||
|
||||
info.update({
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'duration': int_or_none(video_data.get('length')),
|
||||
})
|
||||
if len(entries) > 1:
|
||||
info['_type'] = 'multi_video'
|
||||
info['entries'] = entries
|
||||
elif len(entries) == 1:
|
||||
i = entries[0].copy()
|
||||
i.update(info)
|
||||
info = i
|
||||
else:
|
||||
formats = []
|
||||
for s in video_data.get('strm', []):
|
||||
s_url = s.get('purl')
|
||||
if not s_url:
|
||||
continue
|
||||
stype = s.get('stype')
|
||||
if stype == 'HLS':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
s_url, channel_id, 'mp4', m3u8_id=stype, fatal=False))
|
||||
elif stype == 'RTMP':
|
||||
format_id = [stype]
|
||||
label = s.get('label')
|
||||
if label:
|
||||
format_id.append(label)
|
||||
formats.append({
|
||||
'format_id': '-'.join(format_id),
|
||||
'url': s_url,
|
||||
'tbr': int_or_none(s.get('bps')),
|
||||
'height': int_or_none(s.get('brt')),
|
||||
'ext': 'flv',
|
||||
'rtmp_live': True,
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
info.update({
|
||||
'id': channel_id,
|
||||
'title': self._live_title(title),
|
||||
'is_live': True,
|
||||
'formats': formats,
|
||||
})
|
||||
|
||||
return info
|
||||
|
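Review note: the VOD branch of `AfreecaTVGlobalIE` returns a `multi_video` result when several parts are found and a flat result when there is exactly one. The sketch below isolates that merge step; `merge_entries` is a name invented here and the sample values are made up.

```python
def merge_entries(info, entries):
    """Combine shared metadata with per-part entries, as in the diff's VOD branch."""
    info = dict(info)
    if len(entries) > 1:
        info['_type'] = 'multi_video'
        info['entries'] = entries
    elif len(entries) == 1:
        merged = entries[0].copy()
        merged.update(info)  # shared fields override the entry's own
        info = merged
    return info


shared = {'id': '58301', 'title': 'tryhard top100'}
part = {'id': '58301_1', 'title': 'tryhard top100', 'url': 'http://example.com/1.m3u8'}

single = merge_entries(shared, [part])
multi = merge_entries(shared, [part, dict(part, id='58301_2')])
print(single['id'], single['url'])
print(multi['_type'], len(multi['entries']))
```

In the single-entry case the shared `id` replaces the part's `'<video_id>_<part>'` id, because `info` is applied on top of the entry.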
@ -20,7 +20,7 @@ class AirMozillaIE(InfoExtractor):
|
||||
'id': '6x4q2w',
|
||||
'ext': 'mp4',
|
||||
'title': 'Privacy Lab - a meetup for privacy minded people in San Francisco',
|
||||
'thumbnail': 're:https?://vid\.ly/(?P<id>[0-9a-z-]+)/poster',
|
||||
'thumbnail': r're:https?://vid\.ly/(?P<id>[0-9a-z-]+)/poster',
|
||||
'description': 'Brings together privacy professionals and others interested in privacy at for-profits, non-profits, and NGOs in an effort to contribute to the state of the ecosystem...',
|
||||
'timestamp': 1422487800,
|
||||
'upload_date': '20150128',
|
||||
|
@ -21,7 +21,7 @@ class AllocineIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Astérix - Le Domaine des Dieux Teaser VF',
|
||||
'description': 'md5:4a754271d9c6f16c72629a8a993ee884',
|
||||
'thumbnail': 're:http://.*\.jpg',
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.allocine.fr/video/player_gen_cmedia=19540403&cfilm=222257.html',
|
||||
@ -32,7 +32,7 @@ class AllocineIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Planes 2 Bande-annonce VF',
|
||||
'description': 'Regardez la bande annonce du film Planes 2 (Planes 2 Bande-annonce VF). Planes 2, un film de Roberts Gannaway',
|
||||
'thumbnail': 're:http://.*\.jpg',
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.allocine.fr/video/player_gen_cmedia=19544709&cfilm=181290.html',
|
||||
@ -43,7 +43,7 @@ class AllocineIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Dragons 2 - Bande annonce finale VF',
|
||||
'description': 'md5:6cdd2d7c2687d4c6aafe80a35e17267a',
|
||||
'thumbnail': 're:http://.*\.jpg',
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.allocine.fr/video/video-19550147/',
|
||||
@ -53,7 +53,7 @@ class AllocineIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Faux Raccord N°123 - Les gaffes de Cliffhanger',
|
||||
'description': 'md5:bc734b83ffa2d8a12188d9eb48bb6354',
|
||||
'thumbnail': 're:http://.*\.jpg',
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
},
|
||||
}]
|
||||
|
||||
|
@ -19,7 +19,7 @@ class AlphaPornoIE(InfoExtractor):
|
||||
'display_id': 'sensual-striptease-porn-with-samantha-alexandra',
|
||||
'ext': 'mp4',
|
||||
'title': 'Sensual striptease porn with Samantha Alexandra',
|
||||
'thumbnail': 're:https?://.*\.jpg$',
|
||||
'thumbnail': r're:https?://.*\.jpg$',
|
||||
'timestamp': 1418694611,
|
||||
'upload_date': '20141216',
|
||||
'duration': 387,
|
||||
|
@ -10,7 +10,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class AMCNetworksIE(ThePlatformIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?[^/]+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
|
||||
'md5': '',
|
||||
@ -44,6 +44,12 @@ class AMCNetworksIE(ThePlatformIE):
|
||||
}, {
|
||||
'url': 'http://www.bbcamerica.com/shows/doctor-who/full-episodes/the-power-of-the-daleks/episode-01-episode-1-color-version',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.wetv.com/shows/mama-june-from-not-to-hot/full-episode/season-01/thin-tervention',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.wetv.com/shows/la-hair/videos/season-05/episode-09-episode-9-2/episode-9-sneak-peek-3',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
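Review note: the relaxed AMC Networks pattern accepts any depth of path segments under `shows/` and takes the last segment as the id. A quick check against the URLs listed in the tests above; the pattern is copied from the diff and `match_id` is a hypothetical helper name.

```python
import re

VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies|shows(?:/[^/]+)+)/(?P<id>[^/?#]+)'


def match_id(url):
    """Return the display id captured by the pattern, or None if the URL does not match."""
    mobj = re.match(VALID_URL, url)
    return mobj.group('id') if mobj else None


print(match_id('http://www.ifc.com/shows/maron/season-04/episode-01/step-1'))
print(match_id('http://www.wetv.com/shows/mama-june-from-not-to-hot/full-episode/season-01/thin-tervention'))
print(match_id('http://www.wetv.com/shows/la-hair/videos/season-05/episode-09-episode-9-2/episode-9-sneak-peek-3'))
```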
||||
@ -53,20 +59,30 @@ class AMCNetworksIE(ThePlatformIE):
|
||||
'mbr': 'true',
|
||||
'manifest': 'm3u',
|
||||
}
|
||||
media_url = self._search_regex(r'window\.platformLinkURL\s*=\s*[\'"]([^\'"]+)', webpage, 'media url')
|
||||
media_url = self._search_regex(
|
||||
r'window\.platformLinkURL\s*=\s*[\'"]([^\'"]+)',
|
||||
webpage, 'media url')
|
||||
theplatform_metadata = self._download_theplatform_metadata(self._search_regex(
|
||||
r'https?://link.theplatform.com/s/([^?]+)', media_url, 'theplatform_path'), display_id)
|
||||
r'link\.theplatform\.com/s/([^?]+)',
|
||||
media_url, 'theplatform_path'), display_id)
|
||||
info = self._parse_theplatform_metadata(theplatform_metadata)
|
||||
video_id = theplatform_metadata['pid']
|
||||
title = theplatform_metadata['title']
|
||||
rating = theplatform_metadata['ratings'][0]['rating']
|
||||
auth_required = self._search_regex(r'window\.authRequired\s*=\s*(true|false);', webpage, 'auth required')
|
||||
auth_required = self._search_regex(
|
||||
r'window\.authRequired\s*=\s*(true|false);',
|
||||
webpage, 'auth required')
|
||||
if auth_required == 'true':
|
||||
requestor_id = self._search_regex(r'window\.requestor_id\s*=\s*[\'"]([^\'"]+)', webpage, 'requestor id')
|
||||
resource = self._get_mvpd_resource(requestor_id, title, video_id, rating)
|
||||
query['auth'] = self._extract_mvpd_auth(url, video_id, requestor_id, resource)
|
||||
requestor_id = self._search_regex(
|
||||
r'window\.requestor_id\s*=\s*[\'"]([^\'"]+)',
|
||||
webpage, 'requestor id')
|
||||
resource = self._get_mvpd_resource(
|
||||
requestor_id, title, video_id, rating)
|
||||
query['auth'] = self._extract_mvpd_auth(
|
||||
url, video_id, requestor_id, resource)
|
||||
media_url = update_url_query(media_url, query)
|
||||
formats, subtitles = self._extract_theplatform_smil(media_url, video_id)
|
||||
formats, subtitles = self._extract_theplatform_smil(
|
||||
media_url, video_id)
|
||||
self._sort_formats(formats)
|
||||
info.update({
|
||||
'id': video_id,
|
||||
@ -78,9 +94,11 @@ class AMCNetworksIE(ThePlatformIE):
|
||||
if ns_keys:
|
||||
ns = list(ns_keys)[0]
|
||||
series = theplatform_metadata.get(ns + '$show')
|
||||
season_number = int_or_none(theplatform_metadata.get(ns + '$season'))
|
||||
season_number = int_or_none(
|
||||
theplatform_metadata.get(ns + '$season'))
|
||||
episode = theplatform_metadata.get(ns + '$episodeTitle')
|
||||
episode_number = int_or_none(theplatform_metadata.get(ns + '$episode'))
|
||||
episode_number = int_or_none(
|
||||
theplatform_metadata.get(ns + '$episode'))
|
||||
if season_number:
|
||||
title = 'Season %d - %s' % (season_number, title)
|
||||
if series:
|
||||
|
@ -12,7 +12,7 @@ from ..utils import (
|
||||
|
||||
class AolIE(InfoExtractor):
|
||||
IE_NAME = 'on.aol.com'
|
||||
_VALID_URL = r'(?:aol-video:|https?://on\.aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)'
|
||||
_VALID_URL = r'(?:aol-video:|https?://(?:(?:www|on)\.)?aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)'
|
||||
|
||||
_TESTS = [{
|
||||
# video with 5min ID
|
||||
@ -33,7 +33,7 @@ class AolIE(InfoExtractor):
|
||||
}
|
||||
}, {
|
||||
# video with vidible ID
|
||||
'url': 'http://on.aol.com/video/netflix-is-raising-rates-5707d6b8e4b090497b04f706?context=PC:homepage:PL1944:1460189336183',
|
||||
'url': 'http://www.aol.com/video/view/netflix-is-raising-rates/5707d6b8e4b090497b04f706/',
|
||||
'info_dict': {
|
||||
'id': '5707d6b8e4b090497b04f706',
|
||||
'ext': 'mp4',
|
||||
@ -108,30 +108,3 @@ class AolIE(InfoExtractor):
|
||||
'uploader': video_data.get('videoOwner'),
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
|
||||
class AolFeaturesIE(InfoExtractor):
|
||||
IE_NAME = 'features.aol.com'
|
||||
_VALID_URL = r'https?://features\.aol\.com/video/(?P<id>[^/?#]+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://features.aol.com/video/behind-secret-second-careers-late-night-talk-show-hosts',
|
||||
'md5': '7db483bb0c09c85e241f84a34238cc75',
|
||||
'info_dict': {
|
||||
'id': '519507715',
|
||||
'ext': 'mp4',
|
||||
'title': 'What To Watch - February 17, 2016',
|
||||
},
|
||||
'add_ie': ['FiveMin'],
|
||||
'params': {
|
||||
# encrypted m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
return self.url_result(self._search_regex(
|
||||
r'<script type="text/javascript" src="(https?://[^/]*?5min\.com/Scripts/PlayerSeed\.js[^"]+)"',
|
||||
webpage, '5min embed url'), 'FiveMin')
|
||||
|
@ -1,13 +1,13 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .jwplatform import JWPlatformBaseIE
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
unified_strdate,
|
||||
clean_html,
|
||||
)
|
||||
|
||||
|
||||
class ArchiveOrgIE(JWPlatformBaseIE):
|
||||
class ArchiveOrgIE(InfoExtractor):
|
||||
IE_NAME = 'archive.org'
|
||||
IE_DESC = 'archive.org videos'
|
||||
_VALID_URL = r'https?://(?:www\.)?archive\.org/(?:details|embed)/(?P<id>[^/?#]+)(?:[?].*)?$'
|
||||
|
@ -253,7 +253,7 @@ class ARDIE(InfoExtractor):
|
||||
'duration': 2600,
|
||||
'title': 'Die Story im Ersten: Mission unter falscher Flagge',
|
||||
'upload_date': '20140804',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
},
|
||||
'skip': 'HTTP Error 404: Not Found',
|
||||
}
|
||||
|
@ -4,8 +4,10 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_urlparse
|
||||
from ..utils import (
|
||||
determine_ext,
|
||||
ExtractorError,
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
mimetype2ext,
|
||||
@ -15,7 +17,13 @@ from ..utils import (
|
||||
|
||||
|
||||
class ArkenaIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)'
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
(?:
|
||||
video\.arkena\.com/play2/embed/player\?|
|
||||
play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)
|
||||
)
|
||||
'''
|
||||
_TESTS = [{
|
||||
'url': 'https://play.arkena.com/embed/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411',
|
||||
'md5': 'b96f2f71b359a8ecd05ce4e1daa72365',
|
||||
@ -37,6 +45,9 @@ class ArkenaIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'http://play.arkena.com/embed/avp/v1/player/media/327336/darkmatter/131064/',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://video.arkena.com/play2/embed/player?accountId=472718&mediaId=35763b3b-00090078-bf604299&pageStyling=styled',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@staticmethod
|
||||
@ -53,6 +64,14 @@ class ArkenaIE(InfoExtractor):
|
||||
video_id = mobj.group('id')
|
||||
account_id = mobj.group('account_id')
|
||||
|
||||
# Handle http://video.arkena.com/play2/embed/player URL
|
||||
if not video_id:
|
||||
qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
|
||||
video_id = qs.get('mediaId', [None])[0]
|
||||
account_id = qs.get('accountId', [None])[0]
|
||||
if not video_id or not account_id:
|
||||
raise ExtractorError('Invalid URL', expected=True)
|
||||
|
||||
playlist = self._download_json(
|
||||
'https://play.arkena.com/config/avp/v2/player/media/%s/0/%s/?callbackMethod=_'
|
||||
% (video_id, account_id),
|
||||
|
@ -30,7 +30,7 @@ class AtresPlayerIE(InfoExtractor):
|
||||
'title': 'Especial Solidario de Nochebuena',
|
||||
'description': 'md5:e2d52ff12214fa937107d21064075bf1',
|
||||
'duration': 5527.6,
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
},
|
||||
'skip': 'This video is only available for registered users'
|
||||
},
|
||||
@ -43,7 +43,7 @@ class AtresPlayerIE(InfoExtractor):
|
||||
'title': 'David Bustamante',
|
||||
'description': 'md5:f33f1c0a05be57f6708d4dd83a3b81c6',
|
||||
'duration': 1439.0,
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
},
|
||||
},
|
||||
{
|
||||
|
@ -14,7 +14,7 @@ class ATTTechChannelIE(InfoExtractor):
|
||||
'ext': 'flv',
|
||||
'title': 'AT&T Archives : The UNIX System: Making Computers Easier to Use',
|
||||
'description': 'A 1982 film about UNIX is the foundation for software in use around Bell Labs and AT&T.',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'upload_date': '20140127',
|
||||
},
|
||||
'params': {
|
||||
|
@ -17,7 +17,7 @@ class AudioBoomIE(InfoExtractor):
|
||||
'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans',
|
||||
'duration': 2245.72,
|
||||
'uploader': 'Steve Czaban',
|
||||
'uploader_url': 're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
|
||||
'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
|
||||
}
|
||||
}, {
|
||||
'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0',
|
||||
|
213
youtube_dl/extractor/azmedien.py
Normal file
@ -0,0 +1,213 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .kaltura import KalturaIE
|
||||
from ..utils import (
|
||||
get_element_by_class,
|
||||
get_element_by_id,
|
||||
strip_or_none,
|
||||
urljoin,
|
||||
)
|
||||
|
||||
|
||||
class AZMedienBaseIE(InfoExtractor):
|
||||
def _kaltura_video(self, partner_id, entry_id):
|
||||
return self.url_result(
|
||||
'kaltura:%s:%s' % (partner_id, entry_id), ie=KalturaIE.ie_key(),
|
||||
video_id=entry_id)
|
||||
|
||||
|
||||
class AZMedienIE(AZMedienBaseIE):
|
||||
IE_DESC = 'AZ Medien videos'
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
(?:www\.)?
|
||||
(?:
|
||||
telezueri\.ch|
|
||||
telebaern\.tv|
|
||||
telem1\.ch
|
||||
)/
|
||||
[0-9]+-show-[^/\#]+
|
||||
(?:
|
||||
/[0-9]+-episode-[^/\#]+
|
||||
(?:
|
||||
/[0-9]+-segment-(?:[^/\#]+\#)?|
|
||||
\#
|
||||
)|
|
||||
\#
|
||||
)
|
||||
(?P<id>[^\#]+)
|
||||
'''
|
||||
|
||||
_TESTS = [{
|
||||
# URL with 'segment'
|
||||
'url': 'http://www.telezueri.ch/62-show-zuerinews/13772-episode-sonntag-18-dezember-2016/32419-segment-massenabweisungen-beim-hiltl-club-wegen-pelzboom',
|
||||
'info_dict': {
|
||||
'id': '1_2444peh4',
|
||||
'ext': 'mov',
|
||||
'title': 'Massenabweisungen beim Hiltl Club wegen Pelzboom',
|
||||
'description': 'md5:9ea9dd1b159ad65b36ddcf7f0d7c76a8',
|
||||
'uploader_id': 'TeleZ?ri',
|
||||
'upload_date': '20161218',
|
||||
'timestamp': 1482084490,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
# URL with 'segment' and fragment:
|
||||
'url': 'http://www.telebaern.tv/118-show-news/14240-episode-dienstag-17-januar-2017/33666-segment-achtung-gefahr#zu-wenig-pflegerinnen-und-pfleger',
|
||||
'only_matching': True
|
||||
}, {
|
||||
# URL with 'episode' and fragment:
|
||||
'url': 'http://www.telem1.ch/47-show-sonntalk/13986-episode-soldaten-fuer-grenzschutz-energiestrategie-obama-bilanz#soldaten-fuer-grenzschutz-energiestrategie-obama-bilanz',
|
||||
'only_matching': True
|
||||
}, {
|
||||
# URL with 'show' and fragment:
|
||||
'url': 'http://www.telezueri.ch/66-show-sonntalk#burka-plakate-trump-putin-china-besuch',
|
||||
'only_matching': True
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
partner_id = self._search_regex(
|
||||
r'<script[^>]+src=["\'](?:https?:)?//(?:[^/]+\.)?kaltura\.com(?:/[^/]+)*/(?:p|partner_id)/([0-9]+)',
|
||||
webpage, 'kaltura partner id')
|
||||
entry_id = self._html_search_regex(
|
||||
r'<a[^>]+data-id=(["\'])(?P<id>(?:(?!\1).)+)\1[^>]+data-slug=["\']%s'
|
||||
% re.escape(video_id), webpage, 'kaltura entry id', group='id')
|
||||
|
||||
return self._kaltura_video(partner_id, entry_id)
|
||||
|
||||
|
||||
class AZMedienPlaylistIE(AZMedienBaseIE):
|
||||
IE_DESC = 'AZ Medien playlists'
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
(?:www\.)?
|
||||
(?:
|
||||
telezueri\.ch|
|
||||
telebaern\.tv|
|
||||
telem1\.ch
|
||||
)/
|
||||
(?P<id>[0-9]+-
|
||||
(?:
|
||||
show|
|
||||
topic|
|
||||
themen
|
||||
)-[^/\#]+
|
||||
(?:
|
||||
/[0-9]+-episode-[^/\#]+
|
||||
)?
|
||||
)$
|
||||
'''
|
||||
|
||||
_TESTS = [{
|
||||
# URL with 'episode'
|
||||
'url': 'http://www.telebaern.tv/118-show-news/13735-episode-donnerstag-15-dezember-2016',
|
||||
'info_dict': {
|
||||
'id': '118-show-news/13735-episode-donnerstag-15-dezember-2016',
|
||||
'title': 'News - Donnerstag, 15. Dezember 2016',
|
||||
},
|
||||
'playlist_count': 9,
|
||||
}, {
|
||||
# URL with 'themen'
|
||||
'url': 'http://www.telem1.ch/258-themen-tele-m1-classics',
|
||||
'info_dict': {
|
||||
'id': '258-themen-tele-m1-classics',
|
||||
'title': 'Tele M1 Classics',
|
||||
},
|
||||
'playlist_mincount': 15,
|
||||
}, {
|
||||
# URL with 'topic', contains nested playlists
|
||||
'url': 'http://www.telezueri.ch/219-topic-aera-trump-hat-offiziell-begonnen',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# URL with 'show' only
|
||||
'url': 'http://www.telezueri.ch/86-show-talktaeglich',
|
||||
'only_matching': True
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
show_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, show_id)
|
||||
|
||||
entries = []
|
||||
|
||||
partner_id = self._search_regex(
|
||||
r'src=["\'](?:https?:)?//(?:[^/]+\.)kaltura\.com/(?:[^/]+/)*(?:p|partner_id)/(\d+)',
|
||||
webpage, 'kaltura partner id', default=None)
|
||||
|
||||
if partner_id:
|
||||
entries = [
|
||||
self._kaltura_video(partner_id, m.group('id'))
|
||||
for m in re.finditer(
|
||||
r'data-id=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage)]
|
||||
|
||||
if not entries:
|
||||
entries = [
|
||||
self.url_result(m.group('url'), ie=AZMedienIE.ie_key())
|
||||
for m in re.finditer(
|
||||
r'<a[^>]+data-real=(["\'])(?P<url>http.+?)\1', webpage)]
|
||||
|
||||
if not entries:
|
||||
entries = [
|
||||
# May contain nested playlists (e.g. [1]) thus no explicit
|
||||
# ie_key
|
||||
# 1. http://www.telezueri.ch/219-topic-aera-trump-hat-offiziell-begonnen)
|
||||
self.url_result(urljoin(url, m.group('url')))
|
||||
for m in re.finditer(
|
||||
r'<a[^>]+name=[^>]+href=(["\'])(?P<url>/.+?)\1', webpage)]
|
||||
|
||||
title = self._search_regex(
|
||||
r'episodeShareTitle\s*=\s*(["\'])(?P<title>(?:(?!\1).)+)\1',
|
||||
webpage, 'title',
|
||||
default=strip_or_none(get_element_by_id(
|
||||
'video-title', webpage)), group='title')
|
||||
|
||||
return self.playlist_result(entries, show_id, title)
|
||||
|
||||
|
||||
class AZMedienShowPlaylistIE(AZMedienBaseIE):
|
||||
IE_DESC = 'AZ Medien show playlists'
|
||||
_VALID_URL = r'''(?x)
|
||||
https?://
|
||||
(?:www\.)?
|
||||
(?:
|
||||
telezueri\.ch|
|
||||
telebaern\.tv|
|
||||
telem1\.ch
|
||||
)/
|
||||
(?:
|
||||
all-episodes|
|
||||
alle-episoden
|
||||
)/
|
||||
(?P<id>[^/?#&]+)
|
||||
'''
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.telezueri.ch/all-episodes/astrotalk',
|
||||
'info_dict': {
|
||||
'id': 'astrotalk',
|
||||
'title': 'TeleZüri: AstroTalk - alle episoden',
|
||||
'description': 'md5:4c0f7e7d741d906004266e295ceb4a26',
|
||||
},
|
||||
'playlist_mincount': 13,
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
playlist_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, playlist_id)
|
||||
episodes = get_element_by_class('search-mobile-box', webpage)
|
||||
entries = [self.url_result(
|
||||
urljoin(url, m.group('url'))) for m in re.finditer(
|
||||
r'<a[^>]+href=(["\'])(?P<url>(?:(?!\1).)+)\1', episodes)]
|
||||
title = self._og_search_title(webpage, fatal=False)
|
||||
description = self._og_search_description(webpage)
|
||||
return self.playlist_result(entries, playlist_id, title, description)
|
@ -21,7 +21,7 @@ class AzubuIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': '2014 HOT6 CUP LAST BIG MATCH Ro8 Day 1',
|
||||
'description': 'md5:d06bdea27b8cc4388a90ad35b5c66c01',
|
||||
'thumbnail': 're:^https?://.*\.jpe?g',
|
||||
'thumbnail': r're:^https?://.*\.jpe?g',
|
||||
'timestamp': 1417523507.334,
|
||||
'upload_date': '20141202',
|
||||
'duration': 9988.7,
|
||||
@ -38,7 +38,7 @@ class AzubuIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Fnatic at Worlds 2014: Toyz - "I love Rekkles, he has amazing mechanics"',
|
||||
'description': 'md5:4a649737b5f6c8b5c5be543e88dc62af',
|
||||
'thumbnail': 're:^https?://.*\.jpe?g',
|
||||
'thumbnail': r're:^https?://.*\.jpe?g',
|
||||
'timestamp': 1410530893.320,
|
||||
'upload_date': '20140912',
|
||||
'duration': 172.385,
|
||||
|
@ -209,6 +209,15 @@ class BandcampAlbumIE(InfoExtractor):
|
||||
'id': 'entropy-ep',
|
||||
},
|
||||
'playlist_mincount': 3,
|
||||
}, {
|
||||
# not all tracks have songs
|
||||
'url': 'https://insulters.bandcamp.com/album/we-are-the-plague',
|
||||
'info_dict': {
|
||||
'id': 'we-are-the-plague',
|
||||
'title': 'WE ARE THE PLAGUE',
|
||||
'uploader_id': 'insulters',
|
||||
},
|
||||
'playlist_count': 2,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
@ -217,12 +226,16 @@ class BandcampAlbumIE(InfoExtractor):
|
||||
album_id = mobj.group('album_id')
|
||||
playlist_id = album_id or uploader_id
|
||||
webpage = self._download_webpage(url, playlist_id)
|
||||
tracks_paths = re.findall(r'<a href="(.*?)" itemprop="url">', webpage)
|
||||
if not tracks_paths:
|
||||
track_elements = re.findall(
|
||||
r'(?s)<div[^>]*>(.*?<a[^>]+href="([^"]+?)"[^>]+itemprop="url"[^>]*>.*?)</div>', webpage)
|
||||
if not track_elements:
|
||||
raise ExtractorError('The page doesn\'t contain any tracks')
|
||||
# Only tracks with duration info have songs
|
||||
entries = [
|
||||
self.url_result(compat_urlparse.urljoin(url, t_path), ie=BandcampIE.ie_key())
|
||||
for t_path in tracks_paths]
|
||||
for elem_content, t_path in track_elements
|
||||
if self._html_search_meta('duration', elem_content, default=None)]
|
||||
|
||||
title = self._html_search_regex(
|
||||
r'album_title\s*:\s*"((?:\\.|[^"\\])+?)"',
|
||||
webpage, 'title', fatal=False)
|
||||
|
@ -225,6 +225,8 @@ class BBCCoUkIE(InfoExtractor):
|
||||
}
|
||||
]
|
||||
|
||||
_USP_RE = r'/([^/]+?)\.ism(?:\.hlsv2\.ism)?/[^/]+\.m3u8'
|
||||
|
||||
class MediaSelectionError(Exception):
|
||||
def __init__(self, id):
|
||||
self.id = id
|
||||
@ -336,6 +338,15 @@ class BBCCoUkIE(InfoExtractor):
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
href, programme_id, ext='mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id=format_id, fatal=False))
|
||||
if re.search(self._USP_RE, href):
|
||||
usp_formats = self._extract_m3u8_formats(
|
||||
re.sub(self._USP_RE, r'/\1.ism/\1.m3u8', href),
|
||||
programme_id, ext='mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id=format_id, fatal=False)
|
||||
for f in usp_formats:
|
||||
if f.get('height') and f['height'] > 720:
|
||||
continue
|
||||
formats.append(f)
|
||||
elif transfer_format == 'hds':
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
href, programme_id, f4m_id=format_id, fatal=False))
|
||||
|
73
youtube_dl/extractor/beampro.py
Normal file
@ -0,0 +1,73 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
clean_html,
|
||||
compat_str,
|
||||
int_or_none,
|
||||
parse_iso8601,
|
||||
try_get,
|
||||
)
|
||||
|
||||
|
||||
class BeamProLiveIE(InfoExtractor):
|
||||
IE_NAME = 'Beam:live'
|
||||
_VALID_URL = r'https?://(?:\w+\.)?beam\.pro/(?P<id>[^/?#&]+)'
|
||||
_RATINGS = {'family': 0, 'teen': 13, '18+': 18}
|
||||
_TEST = {
|
||||
'url': 'http://www.beam.pro/niterhayven',
|
||||
'info_dict': {
|
||||
'id': '261562',
|
||||
'ext': 'mp4',
|
||||
'title': 'Introducing The Witcher 3 // The Grind Starts Now!',
|
||||
'description': 'md5:0b161ac080f15fe05d18a07adb44a74d',
|
||||
'thumbnail': r're:https://.*\.jpg$',
|
||||
'timestamp': 1483477281,
|
||||
'upload_date': '20170103',
|
||||
'uploader': 'niterhayven',
|
||||
'uploader_id': '373396',
|
||||
'age_limit': 18,
|
||||
'is_live': True,
|
||||
'view_count': int,
|
||||
},
|
||||
'skip': 'niterhayven is offline',
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
channel_name = self._match_id(url)
|
||||
|
||||
chan = self._download_json(
|
||||
'https://beam.pro/api/v1/channels/%s' % channel_name, channel_name)
|
||||
|
||||
if chan.get('online') is False:
|
||||
raise ExtractorError(
|
||||
'{0} is offline'.format(channel_name), expected=True)
|
||||
|
||||
channel_id = chan['id']
|
||||
|
||||
formats = self._extract_m3u8_formats(
|
||||
'https://beam.pro/api/v1/channels/%s/manifest.m3u8' % channel_id,
|
||||
channel_name, ext='mp4', m3u8_id='hls', fatal=False)
|
||||
self._sort_formats(formats)
|
||||
|
||||
user_id = chan.get('userId') or try_get(chan, lambda x: x['user']['id'])
|
||||
|
||||
return {
|
||||
'id': compat_str(chan.get('id') or channel_name),
|
||||
'title': self._live_title(chan.get('name') or channel_name),
|
||||
'description': clean_html(chan.get('description')),
|
||||
'thumbnail': try_get(chan, lambda x: x['thumbnail']['url'], compat_str),
|
||||
'timestamp': parse_iso8601(chan.get('updatedAt')),
|
||||
'uploader': chan.get('token') or try_get(
|
||||
chan, lambda x: x['user']['username'], compat_str),
|
||||
'uploader_id': compat_str(user_id) if user_id else None,
|
||||
'age_limit': self._RATINGS.get(chan.get('audience')),
|
||||
'is_live': True,
|
||||
'view_count': int_or_none(chan.get('viewersTotal')),
|
||||
'formats': formats,
|
||||
}
|
@ -24,7 +24,7 @@ class BellMediaIE(InfoExtractor):
|
||||
space
|
||||
)\.ca|
|
||||
much\.com
|
||||
)/.*?(?:\bvid=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6})'''
|
||||
)/.*?(?:\bvid=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})'''
|
||||
_TESTS = [{
|
||||
'url': 'http://www.ctv.ca/video/player?vid=706966',
|
||||
'md5': 'ff2ebbeae0aa2dcc32a830c3fd69b7b0',
|
||||
@ -55,6 +55,9 @@ class BellMediaIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'http://www.much.com/shows/the-almost-impossible-gameshow/928979/episode-6',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.ctv.ca/DCs-Legends-of-Tomorrow/Video/S2E11-Turncoat-vid1051430',
|
||||
'only_matching': True,
|
||||
}]
|
||||
_DOMAINS = {
|
||||
'thecomedynetwork': 'comedy',
|
||||
|
@ -17,7 +17,7 @@ class BetIE(MTVServicesInfoExtractor):
|
||||
'description': 'President Obama urges persistence in confronting racism and bias.',
|
||||
'duration': 1534,
|
||||
'upload_date': '20141208',
|
||||
'thumbnail': 're:(?i)^https?://.*\.jpg$',
|
||||
'thumbnail': r're:(?i)^https?://.*\.jpg$',
|
||||
'subtitles': {
|
||||
'en': 'mincount:2',
|
||||
}
|
||||
@ -37,7 +37,7 @@ class BetIE(MTVServicesInfoExtractor):
|
||||
'description': 'A BET News special.',
|
||||
'duration': 1696,
|
||||
'upload_date': '20141125',
|
||||
'thumbnail': 're:(?i)^https?://.*\.jpg$',
|
||||
'thumbnail': r're:(?i)^https?://.*\.jpg$',
|
||||
'subtitles': {
|
||||
'en': 'mincount:2',
|
||||
}
|
||||
|
@ -19,7 +19,7 @@ class BildIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Das können die neuen iPads',
|
||||
'description': 'md5:a4058c4fa2a804ab59c00d7244bbf62f',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'duration': 196,
|
||||
}
|
||||
}
|
||||
|
@ -5,19 +5,27 @@ import hashlib
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_parse_qs
|
||||
from ..compat import (
|
||||
compat_parse_qs,
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
float_or_none,
|
||||
parse_iso8601,
|
||||
smuggle_url,
|
||||
strip_jsonp,
|
||||
unified_timestamp,
|
||||
unsmuggle_url,
|
||||
urlencode_postdata,
|
||||
)
|
||||
|
||||
|
||||
class BiliBiliIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.|bangumi\.|)bilibili\.(?:tv|com)/(?:video/av|anime/v/)(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.|bangumi\.|)bilibili\.(?:tv|com)/(?:video/av|anime/(?P<anime_id>\d+)/play#)(?P<id>\d+)'
|
||||
|
||||
_TEST = {
|
||||
_TESTS = [{
|
||||
'url': 'http://www.bilibili.tv/video/av1074402/',
|
||||
'md5': '9fa226fe2b8a9a4d5a69b4c6a183417e',
|
||||
'info_dict': {
|
||||
@ -28,29 +36,65 @@ class BiliBiliIE(InfoExtractor):
|
||||
'duration': 308.315,
|
||||
'timestamp': 1398012660,
|
||||
'upload_date': '20140420',
|
||||
'thumbnail': 're:^https?://.+\.jpg',
|
||||
'thumbnail': r're:^https?://.+\.jpg',
|
||||
'uploader': '菊子桑',
|
||||
'uploader_id': '156160',
|
||||
},
|
||||
}
|
||||
}, {
|
||||
# Tested in BiliBiliBangumiIE
|
||||
'url': 'http://bangumi.bilibili.com/anime/1869/play#40062',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://bangumi.bilibili.com/anime/5802/play#100643',
|
||||
'md5': '3f721ad1e75030cc06faf73587cfec57',
|
||||
'info_dict': {
|
||||
'id': '100643',
|
||||
'ext': 'mp4',
|
||||
'title': 'CHAOS;CHILD',
|
||||
'description': '如果你是神明,并且能够让妄想成为现实。那你会进行怎么样的妄想?是淫靡的世界?独裁社会?毁灭性的制裁?还是……2015年,涩谷。从6年前发生的大灾害“涩谷地震”之后复兴了的这个街区里新设立的私立高中...',
|
||||
},
|
||||
'skip': 'Geo-restricted to China',
|
||||
}]
|
||||
|
||||
_APP_KEY = '6f90a59ac58a4123'
|
||||
_BILIBILI_KEY = '0bfd84cc3940035173f35e6777508326'
|
||||
_APP_KEY = '84956560bc028eb7'
|
||||
_BILIBILI_KEY = '94aba54af9065f71de72f5508f1cd42e'
|
||||
|
||||
def _report_error(self, result):
|
||||
if 'message' in result:
|
||||
raise ExtractorError('%s said: %s' % (self.IE_NAME, result['message']), expected=True)
|
||||
elif 'code' in result:
|
||||
raise ExtractorError('%s returns error %d' % (self.IE_NAME, result['code']), expected=True)
|
||||
else:
|
||||
raise ExtractorError('Can\'t extract Bangumi episode ID')
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
url, smuggled_data = unsmuggle_url(url, {})
|
||||
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('id')
|
||||
anime_id = mobj.group('anime_id')
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
if 'anime/v' not in url:
|
||||
if 'anime/' not in url:
|
||||
cid = compat_parse_qs(self._search_regex(
|
||||
[r'EmbedPlayer\([^)]+,\s*"([^"]+)"\)',
|
||||
r'<iframe[^>]+src="https://secure\.bilibili\.com/secure,([^"]+)"'],
|
||||
webpage, 'player parameters'))['cid'][0]
|
||||
else:
|
||||
if 'no_bangumi_tip' not in smuggled_data:
|
||||
self.to_screen('Downloading episode %s. To download all videos in anime %s, re-run youtube-dl with %s' % (
|
||||
video_id, anime_id, compat_urlparse.urljoin(url, '//bangumi.bilibili.com/anime/%s' % anime_id)))
|
||||
headers = {
|
||||
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
|
||||
}
|
||||
headers.update(self.geo_verification_headers())
|
||||
|
||||
js = self._download_json(
|
||||
'http://bangumi.bilibili.com/web_api/get_source', video_id,
|
||||
data=urlencode_postdata({'episode_id': video_id}),
|
||||
headers={'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8'})
|
||||
headers=headers)
|
||||
if 'result' not in js:
|
||||
self._report_error(js)
|
||||
cid = js['result']['cid']
|
||||
|
||||
payload = 'appkey=%s&cid=%s&otype=json&quality=2&type=mp4' % (self._APP_KEY, cid)
|
||||
@ -58,7 +102,11 @@ class BiliBiliIE(InfoExtractor):
|
||||
|
||||
video_info = self._download_json(
|
||||
'http://interface.bilibili.com/playurl?%s&sign=%s' % (payload, sign),
|
||||
video_id, note='Downloading video info page')
|
||||
video_id, note='Downloading video info page',
|
||||
headers=self.geo_verification_headers())
|
||||
|
||||
if 'durl' not in video_info:
|
||||
self._report_error(video_info)
|
||||
|
||||
entries = []
|
||||
|
||||
@ -85,7 +133,7 @@ class BiliBiliIE(InfoExtractor):
|
||||
title = self._html_search_regex('<h1[^>]+title="([^"]+)">', webpage, 'title')
|
||||
description = self._html_search_meta('description', webpage)
|
||||
timestamp = unified_timestamp(self._html_search_regex(
|
||||
r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', fatal=False))
|
||||
r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', default=None))
|
||||
thumbnail = self._html_search_meta(['og:image', 'thumbnailUrl'], webpage)
|
||||
|
||||
# TODO 'view_count' requires deobfuscating Javascript
|
||||
@ -99,7 +147,7 @@ class BiliBiliIE(InfoExtractor):
|
||||
}
|
||||
|
||||
uploader_mobj = re.search(
|
||||
r'<a[^>]+href="https?://space\.bilibili\.com/(?P<id>\d+)"[^>]+title="(?P<name>[^"]+)"',
|
||||
r'<a[^>]+href="(?:https?:)?//space\.bilibili\.com/(?P<id>\d+)"[^>]+title="(?P<name>[^"]+)"',
|
||||
webpage)
|
||||
if uploader_mobj:
|
||||
info.update({
|
||||
@ -123,3 +171,70 @@ class BiliBiliIE(InfoExtractor):
|
||||
'description': description,
|
||||
'entries': entries,
|
||||
}
|
||||
|
||||
|
||||
class BiliBiliBangumiIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://bangumi\.bilibili\.com/anime/(?P<id>\d+)'
|
||||
|
||||
IE_NAME = 'bangumi.bilibili.com'
|
||||
IE_DESC = 'BiliBili番剧'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://bangumi.bilibili.com/anime/1869',
|
||||
'info_dict': {
|
||||
'id': '1869',
|
||||
'title': '混沌武士',
|
||||
'description': 'md5:6a9622b911565794c11f25f81d6a97d2',
|
||||
},
|
||||
'playlist_count': 26,
|
||||
}, {
|
||||
'url': 'http://bangumi.bilibili.com/anime/1869',
|
||||
'info_dict': {
|
||||
'id': '1869',
|
||||
'title': '混沌武士',
|
||||
'description': 'md5:6a9622b911565794c11f25f81d6a97d2',
|
||||
},
|
||||
'playlist': [{
|
||||
'md5': '91da8621454dd58316851c27c68b0c13',
|
||||
'info_dict': {
|
||||
'id': '40062',
|
||||
'ext': 'mp4',
|
||||
'title': '混沌武士',
|
||||
'description': '故事发生在日本的江户时代。风是一个小酒馆的打工女。一日,酒馆里来了一群恶霸,虽然他们的举动令风十分不满,但是毕竟风只是一届女流,无法对他们采取什么行动,只能在心里嘟哝。这时,酒家里又进来了个“不良份子...',
|
||||
'timestamp': 1414538739,
|
||||
'upload_date': '20141028',
|
||||
'episode': '疾风怒涛 Tempestuous Temperaments',
|
||||
'episode_number': 1,
|
||||
},
|
||||
}],
|
||||
'params': {
|
||||
'playlist_items': '1',
|
||||
},
|
||||
}]
|
||||
|
||||
@classmethod
|
||||
def suitable(cls, url):
|
||||
return False if BiliBiliIE.suitable(url) else super(BiliBiliBangumiIE, cls).suitable(url)
|
||||
|
||||
def _real_extract(self, url):
|
||||
bangumi_id = self._match_id(url)
|
||||
|
||||
# Sometimes this API returns a JSONP response
|
||||
season_info = self._download_json(
|
||||
'http://bangumi.bilibili.com/jsonp/seasoninfo/%s.ver' % bangumi_id,
|
||||
bangumi_id, transform_source=strip_jsonp)['result']
|
||||
|
||||
entries = [{
|
||||
'_type': 'url_transparent',
|
||||
'url': smuggle_url(episode['webplay_url'], {'no_bangumi_tip': 1}),
|
||||
'ie_key': BiliBiliIE.ie_key(),
|
||||
'timestamp': parse_iso8601(episode.get('update_time'), delimiter=' '),
|
||||
'episode': episode.get('index_title'),
|
||||
'episode_number': int_or_none(episode.get('index')),
|
||||
} for episode in season_info['episodes']]
|
||||
|
||||
entries = sorted(entries, key=lambda entry: entry.get('episode_number'))
|
||||
|
||||
return self.playlist_result(
|
||||
entries, bangumi_id,
|
||||
season_info.get('bangumi_title'), season_info.get('evaluate'))
|
||||
|
@ -19,7 +19,7 @@ class BioBioChileTVIE(InfoExtractor):
|
||||
'id': 'sobre-camaras-y-camarillas-parlamentarias',
|
||||
'ext': 'mp4',
|
||||
'title': 'Sobre Cámaras y camarillas parlamentarias',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'uploader': 'Fernando Atria',
|
||||
},
|
||||
'skip': 'URL expired and redirected to http://www.biobiochile.cl/portada/bbtv/index.html',
|
||||
@ -31,7 +31,7 @@ class BioBioChileTVIE(InfoExtractor):
|
||||
'id': 'natalia-valdebenito-repasa-a-diputado-hasbun-paso-a-la-categoria-de-hablar-brutalidades',
|
||||
'ext': 'mp4',
|
||||
'title': 'Natalia Valdebenito repasa a diputado Hasbún: Pasó a la categoría de hablar brutalidades',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'uploader': 'Piangella Obrador',
|
||||
},
|
||||
'params': {
|
||||
|
@ -33,6 +33,10 @@ class BloombergIE(InfoExtractor):
|
||||
'params': {
|
||||
'format': 'best[format_id^=hds]',
|
||||
},
|
||||
}, {
|
||||
# data-bmmrid=
|
||||
'url': 'https://www.bloomberg.com/politics/articles/2017-02-08/le-pen-aide-briefed-french-central-banker-on-plan-to-print-money',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.bloomberg.com/news/articles/2015-11-12/five-strange-things-that-have-been-happening-in-financial-markets',
|
||||
'only_matching': True,
|
||||
@ -45,9 +49,10 @@ class BloombergIE(InfoExtractor):
|
||||
name = self._match_id(url)
|
||||
webpage = self._download_webpage(url, name)
|
||||
video_id = self._search_regex(
|
||||
(r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1',
|
||||
r'videoId\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1'),
|
||||
webpage, 'id', group='url', default=None)
|
||||
(r'["\']bmmrId["\']\s*:\s*(["\'])(?P<id>(?:(?!\1).)+)\1',
|
||||
r'videoId\s*:\s*(["\'])(?P<id>(?:(?!\1).)+)\1',
|
||||
r'data-bmmrid=(["\'])(?P<id>(?:(?!\1).)+)\1'),
|
||||
webpage, 'id', group='id', default=None)
|
||||
if not video_id:
|
||||
bplayer_data = self._parse_json(self._search_regex(
|
||||
r'BPlayer\(null,\s*({[^;]+})\);', webpage, 'id'), name)
|
||||
|
@ -1,9 +1,9 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
import json
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
parse_age_limit,
|
||||
@ -11,7 +11,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class BreakIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<site>break|screenjunkies)\.com/video/(?P<display_id>[^/]+?)(?:-(?P<id>\d+))?(?:[/?#&]|$)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056',
|
||||
'info_dict': {
|
||||
@ -20,45 +20,124 @@ class BreakIE(InfoExtractor):
|
||||
'title': 'When Girls Act Like D-Bags',
|
||||
'age_limit': 13,
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.screenjunkies.com/video/best-quentin-tarantino-movie-2841915',
|
||||
'md5': '5c2b686bec3d43de42bde9ec047536b0',
|
||||
'info_dict': {
|
||||
'id': '2841915',
|
||||
'display_id': 'best-quentin-tarantino-movie',
|
||||
'ext': 'mp4',
|
||||
'title': 'Best Quentin Tarantino Movie',
|
||||
'thumbnail': r're:^https?://.*\.jpg',
|
||||
'duration': 3671,
|
||||
'age_limit': 13,
|
||||
'tags': list,
|
||||
},
|
||||
}, {
|
||||
+        'url': 'http://www.screenjunkies.com/video/honest-trailers-the-dark-knight',
+        'info_dict': {
+            'id': '2348808',
+            'display_id': 'honest-trailers-the-dark-knight',
+            'ext': 'mp4',
+            'title': 'Honest Trailers - The Dark Knight',
+            'thumbnail': r're:^https?://.*\.(?:jpg|png)',
+            'age_limit': 10,
+            'tags': list,
+        },
+    }, {
+        # requires subscription but worked around
+        'url': 'http://www.screenjunkies.com/video/knocking-dead-ep-1-the-show-so-far-3003285',
+        'info_dict': {
+            'id': '3003285',
+            'display_id': 'knocking-dead-ep-1-the-show-so-far',
+            'ext': 'mp4',
+            'title': 'State of The Dead Recap: Knocking Dead Pilot',
+            'thumbnail': r're:^https?://.*\.jpg',
+            'duration': 3307,
+            'age_limit': 13,
+            'tags': list,
+        },
     }, {
         'url': 'http://www.break.com/video/ugc/baby-flex-2773063',
         'only_matching': True,
     }]

-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-        webpage = self._download_webpage(
-            'http://www.break.com/embed/%s' % video_id, video_id)
-        info = json.loads(self._search_regex(
-            r'var embedVars = ({.*})\s*?</script>',
-            webpage, 'info json', flags=re.DOTALL))
+    _DEFAULT_BITRATES = (48, 150, 320, 496, 864, 2240, 3264)

-        youtube_id = info.get('youtubeId')
+    def _real_extract(self, url):
+        site, display_id, video_id = re.match(self._VALID_URL, url).groups()
+
+        if not video_id:
+            webpage = self._download_webpage(url, display_id)
+            video_id = self._search_regex(
+                (r'src=["\']/embed/(\d+)', r'data-video-content-id=["\'](\d+)'),
+                webpage, 'video id')
+
+        webpage = self._download_webpage(
+            'http://www.%s.com/embed/%s' % (site, video_id),
+            display_id, 'Downloading video embed page')
+        embed_vars = self._parse_json(
+            self._search_regex(
+                r'(?s)embedVars\s*=\s*({.+?})\s*</script>', webpage, 'embed vars'),
+            display_id)
+
+        youtube_id = embed_vars.get('youtubeId')
         if youtube_id:
             return self.url_result(youtube_id, 'Youtube')

-        formats = [{
-            'url': media['uri'] + '?' + info['AuthToken'],
-            'tbr': media['bitRate'],
-            'width': media['width'],
-            'height': media['height'],
-        } for media in info['media'] if media.get('mediaPurpose') == 'play']
+        title = embed_vars['contentName']

-        if not formats:
+        formats = []
+        bitrates = []
+        for f in embed_vars.get('media', []):
+            if not f.get('uri') or f.get('mediaPurpose') != 'play':
+                continue
+            bitrate = int_or_none(f.get('bitRate'))
+            if bitrate:
+                bitrates.append(bitrate)
             formats.append({
-                'url': info['videoUri']
+                'url': f['uri'],
+                'format_id': 'http-%d' % bitrate if bitrate else 'http',
+                'width': int_or_none(f.get('width')),
+                'height': int_or_none(f.get('height')),
+                'tbr': bitrate,
+                'format': 'mp4',
             })

-        self._sort_formats(formats)
+        if not bitrates:
+            # When subscriptionLevel > 0, i.e. plus subscription is required
+            # media list will be empty. However, hds and hls uris are still
+            # available. We can grab them assuming bitrates to be default.
+            bitrates = self._DEFAULT_BITRATES

-        duration = int_or_none(info.get('videoLengthInSeconds'))
-        age_limit = parse_age_limit(info.get('audienceRating'))
+        auth_token = embed_vars.get('AuthToken')
+
+        def construct_manifest_url(base_url, ext):
+            pieces = [base_url]
+            pieces.extend([compat_str(b) for b in bitrates])
+            pieces.append('_kbps.mp4.%s?%s' % (ext, auth_token))
+            return ','.join(pieces)
+
+        if bitrates and auth_token:
+            hds_url = embed_vars.get('hdsUri')
+            if hds_url:
+                formats.extend(self._extract_f4m_formats(
+                    construct_manifest_url(hds_url, 'f4m'),
+                    display_id, f4m_id='hds', fatal=False))
+            hls_url = embed_vars.get('hlsUri')
+            if hls_url:
+                formats.extend(self._extract_m3u8_formats(
+                    construct_manifest_url(hls_url, 'm3u8'),
+                    display_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
+        self._sort_formats(formats)

         return {
             'id': video_id,
-            'title': info['contentName'],
-            'thumbnail': info['thumbUri'],
-            'duration': duration,
-            'age_limit': age_limit,
+            'display_id': display_id,
+            'title': title,
+            'thumbnail': embed_vars.get('thumbUri'),
+            'duration': int_or_none(embed_vars.get('videoLengthInSeconds')) or None,
+            'age_limit': parse_age_limit(embed_vars.get('audienceRating')),
+            'tags': embed_vars.get('tags', '').split(','),
             'formats': formats,
         }
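Review note (a sketch, not part of the commit): the manifest URL assembled by `construct_manifest_url` in the hunk above is a comma-joined list made of the base URL, each bitrate, and a suffix that carries the extension and the auth token. The standalone function below mirrors that logic, using plain `str` where the extractor uses `compat_str` and taking `bitrates` and `auth_token` as parameters instead of closing over them. The base URL, bitrates and token are made-up placeholder values, not real site data.

```python
def construct_manifest_url(base_url, bitrates, ext, auth_token):
    """Join the base URL, the bitrates and the suffix with commas."""
    pieces = [base_url]
    pieces.extend(str(b) for b in bitrates)
    pieces.append('_kbps.mp4.%s?%s' % (ext, auth_token))
    return ','.join(pieces)


# Hypothetical inputs, chosen only to show the shape of the result.
example = construct_manifest_url(
    'http://example.invalid/video_', (48, 150), 'm3u8', 'token=abc')
print(example)
```

When the media list is empty, the extractor passes `_DEFAULT_BITRATES` as the bitrate list, so the same join simply produces a longer URL.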
@@ -179,7 +179,7 @@ class BrightcoveLegacyIE(InfoExtractor):

         params = {}

-        playerID = find_param('playerID')
+        playerID = find_param('playerID') or find_param('playerId')
         if playerID is None:
             raise ExtractorError('Cannot find player ID')
         params['playerID'] = playerID
@@ -191,6 +191,16 @@ class BrightcoveLegacyIE(InfoExtractor):
         # These fields hold the id of the video
         videoPlayer = find_param('@videoPlayer') or find_param('videoId') or find_param('videoID') or find_param('@videoList')
         if videoPlayer is not None:
+            if isinstance(videoPlayer, list):
+                videoPlayer = videoPlayer[0]
+            videoPlayer = videoPlayer.strip()
+            # UUID is also possible for videoPlayer (e.g.
+            # http://www.popcornflix.com/hoodies-vs-hooligans/7f2d2b87-bbf2-4623-acfb-ea942b4f01dd
+            # or http://www8.hp.com/cn/zh/home.html)
+            if not (re.match(
+                    r'^(?:\d+|[\da-fA-F]{8}-?[\da-fA-F]{4}-?[\da-fA-F]{4}-?[\da-fA-F]{4}-?[\da-fA-F]{12})$',
+                    videoPlayer) or videoPlayer.startswith('ref:')):
+                return None
             params['@videoPlayer'] = videoPlayer
         linkBase = find_param('linkBaseURL')
         if linkBase is not None:
@@ -204,7 +214,7 @@ class BrightcoveLegacyIE(InfoExtractor):
         # // build Brightcove <object /> XML
         # }
         m = re.search(
-            r'''(?x)customBC.\createVideo\(
+            r'''(?x)customBC\.createVideo\(
                 .*? # skipping width and height
                 ["\'](?P<playerID>\d+)["\']\s*,\s* # playerID
                 ["\'](?P<playerKey>AQ[^"\']{48})[^"\']*["\']\s*,\s* # playerKey begins with AQ and is 50 characters
@@ -232,13 +242,16 @@ class BrightcoveLegacyIE(InfoExtractor):
         """Return a list of all Brightcove URLs from the webpage """

         url_m = re.search(
-            r'<meta\s+property=[\'"]og:video[\'"]\s+content=[\'"](https?://(?:secure|c)\.brightcove.com/[^\'"]+)[\'"]',
-            webpage)
+            r'''(?x)
+                <meta\s+
+                    (?:property|itemprop)=([\'"])(?:og:video|embedURL)\1[^>]+
+                    content=([\'"])(?P<url>https?://(?:secure|c)\.brightcove.com/(?:(?!\2).)+)\2
+            ''', webpage)
         if url_m:
-            url = unescapeHTML(url_m.group(1))
+            url = unescapeHTML(url_m.group('url'))
             # Some sites don't add it, we can't download with this url, for example:
             # http://www.ktvu.com/videos/news/raw-video-caltrain-releases-video-of-man-almost/vCTZdY/
-            if 'playerKey' in url or 'videoId' in url:
+            if 'playerKey' in url or 'videoId' in url or 'idVideo' in url:
                 return [url]

         matches = re.findall(
@@ -259,7 +272,7 @@ class BrightcoveLegacyIE(InfoExtractor):
         url, smuggled_data = unsmuggle_url(url, {})

         # Change the 'videoId' and others field to '@videoPlayer'
-        url = re.sub(r'(?<=[?&])(videoI(d|D)|bctid)', '%40videoPlayer', url)
+        url = re.sub(r'(?<=[?&])(videoI(d|D)|idVideo|bctid)', '%40videoPlayer', url)
         # Change bckey (used by bcove.me urls) to playerKey
         url = re.sub(r'(?<=[?&])bckey', 'playerKey', url)
         mobj = re.match(self._VALID_URL, url)
@@ -508,6 +521,9 @@ class BrightcoveNewIE(InfoExtractor):
         return entries

     def _real_extract(self, url):
+        url, smuggled_data = unsmuggle_url(url, {})
+        self._initialize_geo_bypass(smuggled_data.get('geo_countries'))
+
         account_id, player_id, embed, video_id = re.match(self._VALID_URL, url).groups()

         webpage = self._download_webpage(
@@ -537,8 +553,10 @@ class BrightcoveNewIE(InfoExtractor):
         except ExtractorError as e:
             if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
                 json_data = self._parse_json(e.cause.read().decode(), video_id)[0]
-                raise ExtractorError(
-                    json_data.get('message') or json_data['error_code'], expected=True)
+                message = json_data.get('message') or json_data['error_code']
+                if json_data.get('error_subcode') == 'CLIENT_GEO':
+                    self.raise_geo_restricted(msg=message)
+                raise ExtractorError(message, expected=True)
             raise

         title = json_data['name'].strip()
@@ -548,7 +566,7 @@ class BrightcoveNewIE(InfoExtractor):
             container = source.get('container')
             ext = mimetype2ext(source.get('type'))
             src = source.get('src')
-            if ext == 'ism':
+            if ext == 'ism' or container == 'WVM':
                 continue
             elif ext == 'm3u8' or container == 'M2TS':
                 if not src:
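Review note (a sketch, not part of the commit): the rewritten `<meta>` pattern above captures the opening quote of each attribute and reuses it through a backreference (`\1`, `\2`), so an attribute value may contain the other quote character. The snippet below copies that pattern into a standalone helper; `find_brightcove_meta_url` and both HTML strings are hypothetical names and inputs introduced only for illustration.

```python
import re

# Pattern copied from the hunk above; (?x) lets it span several lines.
META_RE = r'''(?x)
    <meta\s+
        (?:property|itemprop)=([\'"])(?:og:video|embedURL)\1[^>]+
        content=([\'"])(?P<url>https?://(?:secure|c)\.brightcove.com/(?:(?!\2).)+)\2
'''


def find_brightcove_meta_url(html):
    """Return the URL from a matching <meta> tag, or None if there is none."""
    m = re.search(META_RE, html)
    return m.group('url') if m else None


# Hypothetical page fragments: one per attribute name and quote style.
double_quoted = '<meta itemprop="embedURL" content="https://c.brightcove.com/services/viewer?playerKey=AQ&idVideo=1">'
single_quoted = "<meta property='og:video' content='http://secure.brightcove.com/x?videoId=2'>"

print(find_brightcove_meta_url(double_quoted))
print(find_brightcove_meta_url(single_quoted))
```

The negative lookahead `(?:(?!\2).)+` stops the URL at the first character equal to the quote that opened `content`, which is what allows the named group `url` to replace the old positional `group(1)`.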
@@ -16,7 +16,7 @@ class BYUtvIE(InfoExtractor):
             'ext': 'mp4',
             'title': 'Season 5 Episode 5',
             'description': 'md5:e07269172baff037f8e8bf9956bc9747',
-            'thumbnail': 're:^https?://.*\.jpg$',
+            'thumbnail': r're:^https?://.*\.jpg$',
             'duration': 1486.486,
         },
         'params': {
@@ -26,7 +26,7 @@ class CamdemyIE(InfoExtractor):
             'id': '5181',
             'ext': 'mp4',
             'title': 'Ch1-1 Introduction, Signals (02-23-2012)',
-            'thumbnail': 're:^https?://.*\.jpg$',
+            'thumbnail': r're:^https?://.*\.jpg$',
             'creator': 'ss11spring',
             'duration': 1591,
             'upload_date': '20130114',
@@ -41,7 +41,7 @@ class CamdemyIE(InfoExtractor):
             'id': '13885',
             'ext': 'mp4',
             'title': 'EverCam + Camdemy QuickStart',
-            'thumbnail': 're:^https?://.*\.jpg$',
+            'thumbnail': r're:^https?://.*\.jpg$',
             'description': 'md5:2a9f989c2b153a2342acee579c6e7db6',
             'creator': 'evercam',
             'duration': 318,
@@ -27,6 +27,7 @@ class CanalplusIE(InfoExtractor):
                                     (?:www\.)?d8\.tv|
                                     (?:www\.)?c8\.fr|
                                     (?:www\.)?d17\.tv|
+                                    (?:(?:football|www)\.)?cstar\.fr|
                                     (?:www\.)?itele\.fr
                                 )/(?:(?:[^/]+/)*(?P<display_id>[^/?#&]+))?(?:\?.*\bvid=(?P<vid>\d+))?|
                                 player\.canalplus\.fr/#/(?P<id>\d+)
@@ -40,6 +41,7 @@ class CanalplusIE(InfoExtractor):
         'd8': 'd8',
         'c8': 'd8',
         'd17': 'd17',
+        'cstar': 'd17',
         'itele': 'itele',
     }

@@ -86,6 +88,19 @@ class CanalplusIE(InfoExtractor):
             'description': 'Chaque matin du lundi au vendredi, Michaël Darmon reçoit un invité politique à 8h25.',
             'upload_date': '20161014',
         },
+    }, {
+        'url': 'http://football.cstar.fr/cstar-minisite-foot/pid7566-feminines-videos.html?vid=1416769',
+        'info_dict': {
+            'id': '1416769',
+            'display_id': 'pid7566-feminines-videos',
+            'ext': 'mp4',
+            'title': 'France - Albanie : les temps forts de la soirée - 20/09/2016',
+            'description': 'md5:c3f30f2aaac294c1c969b3294de6904e',
+            'upload_date': '20160921',
+        },
+        'params': {
+            'skip_download': True,
+        },
     }, {
         'url': 'http://m.canalplus.fr/?vid=1398231',
         'only_matching': True,
@@ -107,7 +122,7 @@ class CanalplusIE(InfoExtractor):
             [r'<canal:player[^>]+?videoId=(["\'])(?P<id>\d+)',
              r'id=["\']canal_video_player(?P<id>\d+)',
              r'data-video=["\'](?P<id>\d+)'],
-            webpage, 'video id', group='id')
+            webpage, 'video id', default=mobj.group('vid'), group='id')

         info_url = self._VIDEO_INFO_TEMPLATE % (site_id, video_id)
         video_data = self._download_json(info_url, video_id, 'Downloading video JSON')
@@ -17,7 +17,7 @@ class CanvasIE(InfoExtractor):
             'ext': 'mp4',
             'title': 'De afspraak veilt voor de Warmste Week',
             'description': 'md5:24cb860c320dc2be7358e0e5aa317ba6',
-            'thumbnail': 're:^https?://.*\.jpg$',
+            'thumbnail': r're:^https?://.*\.jpg$',
             'duration': 49.02,
         }
     }, {
@@ -29,7 +29,7 @@ class CanvasIE(InfoExtractor):
             'ext': 'mp4',
             'title': 'Pieter 0167',
             'description': 'md5:943cd30f48a5d29ba02c3a104dc4ec4e',
-            'thumbnail': 're:^https?://.*\.jpg$',
+            'thumbnail': r're:^https?://.*\.jpg$',
             'duration': 2553.08,
             'subtitles': {
                 'nl': [{
@@ -48,7 +48,7 @@ class CanvasIE(InfoExtractor):
             'ext': 'mp4',
             'title': 'Herbekijk Sorry voor alles',
             'description': 'md5:8bb2805df8164e5eb95d6a7a29dc0dd3',
-            'thumbnail': 're:^https?://.*\.jpg$',
+            'thumbnail': r're:^https?://.*\.jpg$',
             'duration': 3788.06,
         },
         'params': {
@@ -21,7 +21,7 @@ class CarambaTVIE(InfoExtractor):
             'id': '191910501',
             'ext': 'mp4',
             'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
-            'thumbnail': 're:^https?://.*\.jpg',
+            'thumbnail': r're:^https?://.*\.jpg',
             'duration': 2678.31,
         },
     }, {
@@ -69,7 +69,7 @@ class CarambaTVPageIE(InfoExtractor):
             'id': '475222',
             'ext': 'flv',
             'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
-            'thumbnail': 're:^https?://.*\.jpg',
+            'thumbnail': r're:^https?://.*\.jpg',
             # duration reported by videomore is incorrect
             'duration': int,
         },
@@ -90,36 +90,49 @@ class CBCIE(InfoExtractor):
             },
         }],
         'skip': 'Geo-restricted to Canada',
+    }, {
+        # multiple CBC.APP.Caffeine.initInstance(...)
+        'url': 'http://www.cbc.ca/news/canada/calgary/dog-indoor-exercise-winter-1.3928238',
+        'info_dict': {
+            'title': 'Keep Rover active during the deep freeze with doggie pushups and other fun indoor tasks',
+            'id': 'dog-indoor-exercise-winter-1.3928238',
+        },
+        'playlist_mincount': 6,
     }]

     @classmethod
     def suitable(cls, url):
         return False if CBCPlayerIE.suitable(url) else super(CBCIE, cls).suitable(url)

+    def _extract_player_init(self, player_init, display_id):
+        player_info = self._parse_json(player_init, display_id, js_to_json)
+        media_id = player_info.get('mediaId')
+        if not media_id:
+            clip_id = player_info['clipId']
+            feed = self._download_json(
+                'http://tpfeed.cbc.ca/f/ExhSPC/vms_5akSXx4Ng_Zn?byCustomValue={:mpsReleases}{%s}' % clip_id,
+                clip_id, fatal=False)
+            if feed:
+                media_id = try_get(feed, lambda x: x['entries'][0]['guid'], compat_str)
+            if not media_id:
+                media_id = self._download_json(
+                    'http://feed.theplatform.com/f/h9dtGB/punlNGjMlc1F?fields=id&byContent=byReleases%3DbyId%253D' + clip_id,
+                    clip_id)['entries'][0]['id'].split('/')[-1]
+        return self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
+
     def _real_extract(self, url):
         display_id = self._match_id(url)
         webpage = self._download_webpage(url, display_id)
-        player_init = self._search_regex(
-            r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage, 'player init',
-            default=None)
-        if player_init:
-            player_info = self._parse_json(player_init, display_id, js_to_json)
-            media_id = player_info.get('mediaId')
-            if not media_id:
-                clip_id = player_info['clipId']
-                feed = self._download_json(
-                    'http://tpfeed.cbc.ca/f/ExhSPC/vms_5akSXx4Ng_Zn?byCustomValue={:mpsReleases}{%s}' % clip_id,
-                    clip_id, fatal=False)
-                if feed:
-                    media_id = try_get(feed, lambda x: x['entries'][0]['guid'], compat_str)
-                if not media_id:
-                    media_id = self._download_json(
-                        'http://feed.theplatform.com/f/h9dtGB/punlNGjMlc1F?fields=id&byContent=byReleases%3DbyId%253D' + clip_id,
-                        clip_id)['entries'][0]['id'].split('/')[-1]
-            return self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
-        else:
-            entries = [self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id) for media_id in re.findall(r'<iframe[^>]+src="[^"]+?mediaId=(\d+)"', webpage)]
-            return self.playlist_result(entries)
+        entries = [
+            self._extract_player_init(player_init, display_id)
+            for player_init in re.findall(r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage)]
+        entries.extend([
+            self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
+            for media_id in re.findall(r'<iframe[^>]+src="[^"]+?mediaId=(\d+)"', webpage)])
+        return self.playlist_result(
+            entries, display_id,
+            self._og_search_title(webpage, fatal=False),
+            self._og_search_description(webpage))


 class CBCPlayerIE(InfoExtractor):
@@ -283,6 +296,12 @@ class CBCWatchVideoIE(CBCWatchBaseIE):
         formats = self._extract_m3u8_formats(re.sub(r'/([^/]+)/[^/?]+\.m3u8', r'/\1/\1.m3u8', m3u8_url), video_id, 'mp4', fatal=False)
         if len(formats) < 2:
             formats = self._extract_m3u8_formats(m3u8_url, video_id, 'mp4')
+        for f in formats:
+            format_id = f.get('format_id')
+            if format_id.startswith('AAC'):
+                f['acodec'] = 'aac'
+            elif format_id.startswith('AC3'):
+                f['acodec'] = 'ac-3'
         self._sort_formats(formats)

         info = {
@@ -39,7 +39,7 @@ class CBSNewsIE(CBSIE):
                 'upload_date': '20140404',
                 'timestamp': 1396650660,
                 'uploader': 'CBSI-NEW',
-                'thumbnail': 're:^https?://.*\.jpg$',
+                'thumbnail': r're:^https?://.*\.jpg$',
                 'duration': 205,
                 'subtitles': {
                     'en': [{
@@ -19,7 +19,7 @@ class CCCIE(InfoExtractor):
             'ext': 'mp4',
             'title': 'Introduction to Processor Design',
             'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac',
-            'thumbnail': 're:^https?://.*\.jpg$',
+            'thumbnail': r're:^https?://.*\.jpg$',
             'upload_date': '20131228',
             'timestamp': 1388188800,
             'duration': 3710,
@@ -32,7 +32,7 @@ class CCCIE(InfoExtractor):
     def _real_extract(self, url):
         display_id = self._match_id(url)
         webpage = self._download_webpage(url, display_id)
-        event_id = self._search_regex("data-id='(\d+)'", webpage, 'event id')
+        event_id = self._search_regex(r"data-id='(\d+)'", webpage, 'event id')
         event_data = self._download_json('https://media.ccc.de/public/events/%s' % event_id, event_id)

         formats = []
99
youtube_dl/extractor/ccma.py
Normal file
@@ -0,0 +1,99 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    parse_duration,
+    parse_iso8601,
+    clean_html,
+)
+
+
+class CCMAIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?ccma\.cat/(?:[^/]+/)*?(?P<type>video|audio)/(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'http://www.ccma.cat/tv3/alacarta/lespot-de-la-marato-de-tv3/lespot-de-la-marato-de-tv3/video/5630208/',
+        'md5': '7296ca43977c8ea4469e719c609b0871',
+        'info_dict': {
+            'id': '5630208',
+            'ext': 'mp4',
+            'title': 'L\'espot de La Marató de TV3',
+            'description': 'md5:f12987f320e2f6e988e9908e4fe97765',
+            'timestamp': 1470918540,
+            'upload_date': '20160811',
+        }
+    }, {
+        'url': 'http://www.ccma.cat/catradio/alacarta/programa/el-consell-de-savis-analitza-el-derbi/audio/943685/',
+        'md5': 'fa3e38f269329a278271276330261425',
+        'info_dict': {
+            'id': '943685',
+            'ext': 'mp3',
+            'title': 'El Consell de Savis analitza el derbi',
+            'description': 'md5:e2a3648145f3241cb9c6b4b624033e53',
+            'upload_date': '20171205',
+            'timestamp': 1512507300,
+        }
+    }]
+
+    def _real_extract(self, url):
+        media_type, media_id = re.match(self._VALID_URL, url).groups()
+        media_data = {}
+        formats = []
+        profiles = ['pc'] if media_type == 'audio' else ['mobil', 'pc']
+        for i, profile in enumerate(profiles):
+            md = self._download_json('http://dinamics.ccma.cat/pvideo/media.jsp', media_id, query={
+                'media': media_type,
+                'idint': media_id,
+                'profile': profile,
+            }, fatal=False)
+            if md:
+                media_data = md
+                media_url = media_data.get('media', {}).get('url')
+                if media_url:
+                    formats.append({
+                        'format_id': profile,
+                        'url': media_url,
+                        'quality': i,
+                    })
+        self._sort_formats(formats)
+
+        informacio = media_data['informacio']
+        title = informacio['titol']
+        durada = informacio.get('durada', {})
+        duration = int_or_none(durada.get('milisegons'), 1000) or parse_duration(durada.get('text'))
+        timestamp = parse_iso8601(informacio.get('data_emissio', {}).get('utc'))
+
+        subtitles = {}
+        subtitols = media_data.get('subtitols', {})
+        if subtitols:
+            sub_url = subtitols.get('url')
+            if sub_url:
+                subtitles.setdefault(
+                    subtitols.get('iso') or subtitols.get('text') or 'ca', []).append({
+                        'url': sub_url,
+                    })
+
+        thumbnails = []
+        imatges = media_data.get('imatges', {})
+        if imatges:
+            thumbnail_url = imatges.get('url')
+            if thumbnail_url:
+                thumbnails = [{
+                    'url': thumbnail_url,
+                    'width': int_or_none(imatges.get('amplada')),
+                    'height': int_or_none(imatges.get('alcada')),
+                }]
+
+        return {
+            'id': media_id,
+            'title': title,
+            'description': clean_html(informacio.get('descripcio')),
+            'duration': duration,
+            'timestamp': timestamp,
+            'thumbnails': thumbnails,
+            'subtitles': subtitles,
+            'formats': formats,
+        }
@@ -4,50 +4,188 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import float_or_none
+from ..compat import compat_str
+from ..utils import (
+    float_or_none,
+    try_get,
+    unified_timestamp,
+)


 class CCTVIE(InfoExtractor):
-    _VALID_URL = r'''(?x)https?://(?:.+?\.)?
-        (?:
-            cctv\.(?:com|cn)|
-            cntv\.cn
-        )/
-        (?:
-            video/[^/]+/(?P<id>[0-9a-f]{32})|
-            \d{4}/\d{2}/\d{2}/(?P<display_id>VID[0-9A-Za-z]+)
-        )'''
+    IE_DESC = '央视网'
+    _VALID_URL = r'https?://(?:(?:[^/]+)\.(?:cntv|cctv)\.(?:com|cn)|(?:www\.)?ncpa-classic\.com)/(?:[^/]+/)*?(?P<id>[^/?#&]+?)(?:/index)?(?:\.s?html|[?#&]|$)'
     _TESTS = [{
-        'url': 'http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml',
-        'md5': '819c7b49fc3927d529fb4cd555621823',
+        # fo.addVariable("videoCenterId","id")
+        'url': 'http://sports.cntv.cn/2016/02/12/ARTIaBRxv4rTT1yWf1frW2wi160212.shtml',
+        'md5': 'd61ec00a493e09da810bf406a078f691',
         'info_dict': {
-            'id': '454368eb19ad44a1925bf1eb96140a61',
+            'id': '5ecdbeab623f4973b40ff25f18b174e8',
             'ext': 'mp4',
-            'title': 'Portrait of Real Current Life 09/03/2016 Modern Inventors Part 1',
-        }
+            'title': '[NBA]二少联手砍下46分 雷霆主场击败鹈鹕(快讯)',
+            'description': 'md5:7e14a5328dc5eb3d1cd6afbbe0574e95',
+            'duration': 98,
+            'uploader': 'songjunjie',
+            'timestamp': 1455279956,
+            'upload_date': '20160212',
+        },
+    }, {
+        # var guid = "id"
+        'url': 'http://tv.cctv.com/2016/02/05/VIDEUS7apq3lKrHG9Dncm03B160205.shtml',
+        'info_dict': {
+            'id': 'efc5d49e5b3b4ab2b34f3a502b73d3ae',
+            'ext': 'mp4',
+            'title': '[赛车]“车王”舒马赫恢复情况成谜(快讯)',
+            'description': '2月4日,蒙特泽莫罗透露了关于“车王”舒马赫恢复情况,但情况是否属实遭到了质疑。',
+            'duration': 37,
+            'uploader': 'shujun',
+            'timestamp': 1454677291,
+            'upload_date': '20160205',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # changePlayer('id')
+        'url': 'http://english.cntv.cn/special/four_comprehensives/index.shtml',
+        'info_dict': {
+            'id': '4bb9bb4db7a6471ba85fdeda5af0381e',
+            'ext': 'mp4',
+            'title': 'NHnews008 ANNUAL POLITICAL SEASON',
+            'description': 'Four Comprehensives',
+            'duration': 60,
+            'uploader': 'zhangyunlei',
+            'timestamp': 1425385521,
+            'upload_date': '20150303',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # loadvideo('id')
+        'url': 'http://cctv.cntv.cn/lm/tvseries_russian/yilugesanghua/index.shtml',
+        'info_dict': {
+            'id': 'b15f009ff45c43968b9af583fc2e04b2',
+            'ext': 'mp4',
+            'title': 'Путь,усыпанный космеями Серия 1',
+            'description': 'Путь, усыпанный космеями',
+            'duration': 2645,
+            'uploader': 'renxue',
+            'timestamp': 1477479241,
+            'upload_date': '20161026',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # var initMyAray = 'id'
+        'url': 'http://www.ncpa-classic.com/2013/05/22/VIDE1369219508996867.shtml',
+        'info_dict': {
+            'id': 'a194cfa7f18c426b823d876668325946',
+            'ext': 'mp4',
+            'title': '小泽征尔音乐塾 音乐梦想无国界',
+            'duration': 2173,
+            'timestamp': 1369248264,
+            'upload_date': '20130522',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # var ids = ["id"]
+        'url': 'http://www.ncpa-classic.com/clt/more/416/index.shtml',
+        'info_dict': {
+            'id': 'a8606119a4884588a79d81c02abecc16',
+            'ext': 'mp3',
+            'title': '来自维也纳的新年贺礼',
+            'description': 'md5:f13764ae8dd484e84dd4b39d5bcba2a7',
+            'duration': 1578,
+            'uploader': 'djy',
+            'timestamp': 1482942419,
+            'upload_date': '20161228',
+        },
+        'params': {
+            'skip_download': True,
+        },
+        'expected_warnings': ['Failed to download m3u8 information'],
+    }, {
+        'url': 'http://ent.cntv.cn/2016/01/18/ARTIjprSSJH8DryTVr5Bx8Wb160118.shtml',
+        'only_matching': True,
+    }, {
+        'url': 'http://tv.cntv.cn/video/C39296/e0210d949f113ddfb38d31f00a4e5c44',
+        'only_matching': True,
+    }, {
+        'url': 'http://english.cntv.cn/2016/09/03/VIDEhnkB5y9AgHyIEVphCEz1160903.shtml',
+        'only_matching': True,
     }, {
         'url': 'http://tv.cctv.com/2016/09/07/VIDE5C1FnlX5bUywlrjhxXOV160907.shtml',
         'only_matching': True,
     }, {
         'url': 'http://tv.cntv.cn/video/C39296/95cfac44cabd3ddc4a9438780a4e5c44',
-        'only_matching': True
+        'only_matching': True,
     }]

     def _real_extract(self, url):
-        video_id, display_id = re.match(self._VALID_URL, url).groups()
-        if not video_id:
-            webpage = self._download_webpage(url, display_id)
-            video_id = self._search_regex(
-                r'(?:fo\.addVariable\("videoCenterId",\s*|guid\s*=\s*)"([0-9a-f]{32})',
-                webpage, 'video_id')
-        api_data = self._download_json(
-            'http://vdn.apps.cntv.cn/api/getHttpVideoInfo.do?pid=' + video_id, video_id)
-        m3u8_url = re.sub(r'maxbr=\d+&?', '', api_data['hls_url'])
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        video_id = self._search_regex(
+            [r'var\s+guid\s*=\s*["\']([\da-fA-F]+)',
+             r'videoCenterId["\']\s*,\s*["\']([\da-fA-F]+)',
+             r'changePlayer\s*\(\s*["\']([\da-fA-F]+)',
+             r'load[Vv]ideo\s*\(\s*["\']([\da-fA-F]+)',
+             r'var\s+initMyAray\s*=\s*["\']([\da-fA-F]+)',
+             r'var\s+ids\s*=\s*\[["\']([\da-fA-F]+)'],
+            webpage, 'video id')
+
+        data = self._download_json(
+            'http://vdn.apps.cntv.cn/api/getHttpVideoInfo.do', video_id,
+            query={
+                'pid': video_id,
+                'url': url,
+                'idl': 32,
+                'idlr': 32,
+                'modifyed': 'false',
+            })
+
+        title = data['title']
+
+        formats = []
+
+        video = data.get('video')
+        if isinstance(video, dict):
+            for quality, chapters_key in enumerate(('lowChapters', 'chapters')):
+                video_url = try_get(
+                    video, lambda x: x[chapters_key][0]['url'], compat_str)
+                if video_url:
+                    formats.append({
+                        'url': video_url,
+                        'format_id': 'http',
+                        'quality': quality,
+                        'preference': -1,
+                    })
+
+        hls_url = try_get(data, lambda x: x['hls_url'], compat_str)
+        if hls_url:
+            hls_url = re.sub(r'maxbr=\d+&?', '', hls_url)
+            formats.extend(self._extract_m3u8_formats(
+                hls_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                m3u8_id='hls', fatal=False))
+
+        self._sort_formats(formats)
+
+        uploader = data.get('editer_name')
+        description = self._html_search_meta(
+            'description', webpage, default=None)
+        timestamp = unified_timestamp(data.get('f_pgmtime'))
+        duration = float_or_none(try_get(video, lambda x: x['totalLength']))

         return {
             'id': video_id,
-            'title': api_data['title'],
-            'formats': self._extract_m3u8_formats(
-                m3u8_url, video_id, 'mp4', 'm3u8_native', fatal=False),
-            'duration': float_or_none(api_data.get('video', {}).get('totalLength')),
+            'title': title,
+            'description': description,
+            'uploader': uploader,
+            'timestamp': timestamp,
+            'duration': duration,
+            'formats': formats,
         }
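Review note (a sketch, not part of the commit): both the old and the new code remove the `maxbr` query parameter from the HLS URL with `re.sub(r'maxbr=\d+&?', '', ...)` before fetching the playlist. The helper name `strip_maxbr` and the URLs below are assumptions introduced only to show what that substitution does on its own.

```python
import re


def strip_maxbr(hls_url):
    """Drop a 'maxbr=<digits>' parameter and the '&' that follows it, if any."""
    return re.sub(r'maxbr=\d+&?', '', hls_url)


# Hypothetical URLs, not real CNTV endpoints.
with_more_params = strip_maxbr('http://example.invalid/hls/main.m3u8?maxbr=850&contentid=1')
only_maxbr = strip_maxbr('http://example.invalid/hls/main.m3u8?maxbr=850')
print(with_more_params)
print(only_maxbr)
```

When `maxbr` is the only parameter, the substitution leaves a trailing `?` on the URL; the sketch keeps that behaviour rather than correcting it.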
@@ -1,6 +1,7 @@
 # coding: utf-8
 from __future__ import unicode_literals

+import codecs
 import re

 from .common import InfoExtractor
@@ -24,7 +25,7 @@ class CDAIE(InfoExtractor):
             'height': 720,
             'title': 'Oto dlaczego przed zakrętem należy zwolnić.',
             'description': 'md5:269ccd135d550da90d1662651fcb9772',
-            'thumbnail': 're:^https?://.*\.jpg$',
+            'thumbnail': r're:^https?://.*\.jpg$',
             'average_rating': float,
             'duration': 39
         }
@@ -36,7 +37,7 @@ class CDAIE(InfoExtractor):
             'ext': 'mp4',
             'title': 'Lądowanie na lotnisku na Maderze',
             'description': 'md5:60d76b71186dcce4e0ba6d4bbdb13e1a',
-            'thumbnail': 're:^https?://.*\.jpg$',
+            'thumbnail': r're:^https?://.*\.jpg$',
             'uploader': 'crash404',
             'view_count': int,
             'average_rating': float,
@@ -96,6 +97,10 @@ class CDAIE(InfoExtractor):
             if not video or 'file' not in video:
                 self.report_warning('Unable to extract %s version information' % version)
                 return
+            if video['file'].startswith('uggc'):
+                video['file'] = codecs.decode(video['file'], 'rot_13')
+                if video['file'].endswith('adc.mp4'):
+                    video['file'] = video['file'].replace('adc.mp4', '.mp4')
             f = {
                 'url': video['file'],
             }
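Review note (a sketch, not part of the commit): the added lines above treat a file URL that starts with `uggc` as ROT13-obfuscated, since `uggc` is `http` shifted by 13 letters, and then strip a trailing `adc` marker before `.mp4`. The helper name `normalize_file_url` and the sample URL are assumptions made for illustration. The rendered diff does not preserve indentation, so the sketch applies the `adc.mp4` check inside the ROT13 branch as one possible reading.

```python
import codecs


def normalize_file_url(file_url):
    """Undo ROT13 on obfuscated URLs, then drop the 'adc' suffix marker."""
    if file_url.startswith('uggc'):
        file_url = codecs.decode(file_url, 'rot_13')
        if file_url.endswith('adc.mp4'):
            file_url = file_url.replace('adc.mp4', '.mp4')
    return file_url


# Hypothetical URL, obfuscated here only to exercise the helper.
obfuscated = codecs.encode('http://example.invalid/videoadc.mp4', 'rot_13')
print(obfuscated)
print(normalize_file_url(obfuscated))
```

ROT13 only rotates ASCII letters, so digits, dots and slashes in the URL pass through unchanged in both directions.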
@@ -13,6 +13,7 @@ from ..utils import (
     float_or_none,
     sanitized_Request,
     urlencode_postdata,
+    USER_AGENTS,
 )


@@ -21,11 +22,11 @@ class CeskaTelevizeIE(InfoExtractor):
     _TESTS = [{
         'url': 'http://www.ceskatelevize.cz/ivysilani/ivysilani/10441294653-hyde-park-civilizace/214411058091220',
         'info_dict': {
-            'id': '61924494876951776',
+            'id': '61924494877246241',
             'ext': 'mp4',
-            'title': 'Hyde Park Civilizace',
-            'description': 'md5:fe93f6eda372d150759d11644ebbfb4a',
-            'thumbnail': 're:^https?://.*\.jpg',
+            'title': 'Hyde Park Civilizace: Život v Grónsku',
+            'description': 'md5:3fec8f6bb497be5cdb0c9e8781076626',
+            'thumbnail': r're:^https?://.*\.jpg',
             'duration': 3350,
         },
         'params': {
@@ -39,7 +40,7 @@ class CeskaTelevizeIE(InfoExtractor):
             'ext': 'mp4',
             'title': 'Hyde Park Civilizace: Bonus 01 - En',
             'description': 'English Subtittles',
-            'thumbnail': 're:^https?://.*\.jpg',
+            'thumbnail': r're:^https?://.*\.jpg',
             'duration': 81.3,
         },
         'params': {
@@ -52,7 +53,7 @@ class CeskaTelevizeIE(InfoExtractor):
         'info_dict': {
             'id': 402,
             'ext': 'mp4',
-            'title': 're:^ČT Sport \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
+            'title': r're:^ČT Sport \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
             'is_live': True,
         },
         'params': {
@@ -80,7 +81,7 @@ class CeskaTelevizeIE(InfoExtractor):
                 'id': '61924494877068022',
                 'ext': 'mp4',
                 'title': 'Queer: Bogotart (Queer)',
-                'thumbnail': 're:^https?://.*\.jpg',
+                'thumbnail': r're:^https?://.*\.jpg',
                 'duration': 1558.3,
             },
         }],
@@ -114,70 +115,100 @@ class CeskaTelevizeIE(InfoExtractor):
             'requestSource': 'iVysilani',
         }

-        req = sanitized_Request(
-            'http://www.ceskatelevize.cz/ivysilani/ajax/get-client-playlist',
-            data=urlencode_postdata(data))
-
-        req.add_header('Content-type', 'application/x-www-form-urlencoded')
-        req.add_header('x-addr', '127.0.0.1')
-        req.add_header('X-Requested-With', 'XMLHttpRequest')
-        req.add_header('Referer', url)
-
-        playlistpage = self._download_json(req, playlist_id)
-
-        playlist_url = playlistpage['url']
-        if playlist_url == 'error_region':
-            raise ExtractorError(NOT_AVAILABLE_STRING, expected=True)
-
-        req = sanitized_Request(compat_urllib_parse_unquote(playlist_url))
-        req.add_header('Referer', url)
-
         playlist_title = self._og_search_title(webpage, default=None)
         playlist_description = self._og_search_description(webpage, default=None)

-        playlist = self._download_json(req, playlist_id)['playlist']
-        playlist_len = len(playlist)
-
         entries = []
-        for item in playlist:
-            is_live = item.get('type') == 'LIVE'
-            formats = []
-            for format_id, stream_url in item['streamUrls'].items():
-                formats.extend(self._extract_m3u8_formats(
-                    stream_url, playlist_id, 'mp4',
-                    entry_protocol='m3u8' if is_live else 'm3u8_native',
-                    fatal=False))
-            self._sort_formats(formats)

-            item_id = item.get('id') or item['assetId']
-            title = item['title']
+        for user_agent in (None, USER_AGENTS['Safari']):
+            req = sanitized_Request(
+                'http://www.ceskatelevize.cz/ivysilani/ajax/get-client-playlist',
+                data=urlencode_postdata(data))

-            duration = float_or_none(item.get('duration'))
-            thumbnail = item.get('previewImageUrl')
+            req.add_header('Content-type', 'application/x-www-form-urlencoded')
+            req.add_header('x-addr', '127.0.0.1')
+            req.add_header('X-Requested-With', 'XMLHttpRequest')
+            if user_agent:
+                req.add_header('User-Agent', user_agent)
+            req.add_header('Referer', url)

-            subtitles = {}
-            if item.get('type') == 'VOD':
-                subs = item.get('subtitles')
-                if subs:
-                    subtitles = self.extract_subtitles(episode_id, subs)
+            playlistpage = self._download_json(req, playlist_id, fatal=False)

-            if playlist_len == 1:
-                final_title = playlist_title or title
-                if is_live:
-                    final_title = self._live_title(final_title)
-            else:
-                final_title = '%s (%s)' % (playlist_title, title)
+            if not playlistpage:
+                continue

-            entries.append({
-                'id': item_id,
-                'title': final_title,
-                'description': playlist_description if playlist_len == 1 else None,
-                'thumbnail': thumbnail,
-                'duration': duration,
-                'formats': formats,
-                'subtitles': subtitles,
-                'is_live': is_live,
-            })
+            playlist_url = playlistpage['url']
+            if playlist_url == 'error_region':
+                raise ExtractorError(NOT_AVAILABLE_STRING, expected=True)

+            req = sanitized_Request(compat_urllib_parse_unquote(playlist_url))
|
||||
req.add_header('Referer', url)
|
||||
|
||||
playlist_title = self._og_search_title(webpage, default=None)
|
||||
playlist_description = self._og_search_description(webpage, default=None)
|
||||
|
||||
playlist = self._download_json(req, playlist_id, fatal=False)
|
||||
if not playlist:
|
||||
continue
|
||||
|
||||
playlist = playlist.get('playlist')
|
||||
if not isinstance(playlist, list):
|
||||
continue
|
||||
|
||||
playlist_len = len(playlist)
|
||||
|
||||
for num, item in enumerate(playlist):
|
||||
is_live = item.get('type') == 'LIVE'
|
||||
formats = []
|
||||
for format_id, stream_url in item.get('streamUrls', {}).items():
|
||||
if 'playerType=flash' in stream_url:
|
||||
stream_formats = self._extract_m3u8_formats(
|
||||
stream_url, playlist_id, 'mp4',
|
||||
entry_protocol='m3u8' if is_live else 'm3u8_native',
|
||||
m3u8_id='hls-%s' % format_id, fatal=False)
|
||||
else:
|
||||
stream_formats = self._extract_mpd_formats(
|
||||
stream_url, playlist_id,
|
||||
mpd_id='dash-%s' % format_id, fatal=False)
|
||||
# See https://github.com/rg3/youtube-dl/issues/12119#issuecomment-280037031
|
||||
if format_id == 'audioDescription':
|
||||
for f in stream_formats:
|
||||
f['source_preference'] = -10
|
||||
formats.extend(stream_formats)
|
||||
|
||||
if user_agent and len(entries) == playlist_len:
|
||||
entries[num]['formats'].extend(formats)
|
||||
continue
|
||||
|
||||
item_id = item.get('id') or item['assetId']
|
||||
title = item['title']
|
||||
|
||||
duration = float_or_none(item.get('duration'))
|
||||
thumbnail = item.get('previewImageUrl')
|
||||
|
||||
subtitles = {}
|
||||
if item.get('type') == 'VOD':
|
||||
subs = item.get('subtitles')
|
||||
if subs:
|
||||
subtitles = self.extract_subtitles(episode_id, subs)
|
||||
|
||||
if playlist_len == 1:
|
||||
final_title = playlist_title or title
|
||||
if is_live:
|
||||
final_title = self._live_title(final_title)
|
||||
else:
|
||||
final_title = '%s (%s)' % (playlist_title, title)
|
||||
|
||||
entries.append({
|
||||
'id': item_id,
|
||||
'title': final_title,
|
||||
'description': playlist_description if playlist_len == 1 else None,
|
||||
'thumbnail': thumbnail,
|
||||
'duration': duration,
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
'is_live': is_live,
|
||||
})
|
||||
|
||||
for e in entries:
|
||||
self._sort_formats(e['formats'])
|
||||
|
||||
return self.playlist_result(entries, playlist_id, playlist_title, playlist_description)
|
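Review note: the reworked loop above fetches the playlist once per user agent and, on the second pass, extends the formats of entries that already exist instead of appending duplicates. A minimal sketch of that merge step follows, assuming each pass yields items in the same order. The function name `merge_formats_across_passes` and the tuple layout are hypothetical and are not part of the extractor.

```python
def merge_formats_across_passes(passes):
    """Merge per-item format lists collected over several playlist fetches.

    `passes` is a list of playlists; each playlist is a list of
    (item_id, formats) tuples, in the same item order for every pass.
    """
    entries = []
    for pass_index, playlist in enumerate(passes):
        for num, (item_id, formats) in enumerate(playlist):
            # On later passes, when every item already has an entry,
            # only add the newly found formats to the existing entry.
            if pass_index > 0 and len(entries) == len(playlist):
                entries[num]['formats'].extend(formats)
                continue
            entries.append({'id': item_id, 'formats': list(formats)})
    return entries


# Hypothetical data: first pass finds HLS formats, second pass finds DASH.
merged = merge_formats_across_passes([
    [('a', ['hls-main']), ('b', ['hls-main'])],
    [('a', ['dash-main']), ('b', ['dash-main'])],
])
```

Sorting is deliberately left out of the sketch; in the hunk above it happens once per entry after both passes have finished.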
||||
|
||||
|
@ -31,7 +31,7 @@ class Channel9IE(InfoExtractor):
|
||||
'title': 'Developer Kick-Off Session: Stuff We Love',
|
||||
'description': 'md5:c08d72240b7c87fcecafe2692f80e35f',
|
||||
'duration': 4576,
|
||||
'thumbnail': 're:http://.*\.jpg',
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
'session_code': 'KOS002',
|
||||
'session_day': 'Day 1',
|
||||
'session_room': 'Arena 1A',
|
||||
@ -47,7 +47,7 @@ class Channel9IE(InfoExtractor):
|
||||
'title': 'Self-service BI with Power BI - nuclear testing',
|
||||
'description': 'md5:d1e6ecaafa7fb52a2cacdf9599829f5b',
|
||||
'duration': 1540,
|
||||
'thumbnail': 're:http://.*\.jpg',
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
'authors': ['Mike Wilmot'],
|
||||
},
|
||||
}, {
|
||||
@ -59,7 +59,7 @@ class Channel9IE(InfoExtractor):
|
||||
'title': 'Ranges for the Standard Library',
|
||||
'description': 'md5:2e6b4917677af3728c5f6d63784c4c5d',
|
||||
'duration': 5646,
|
||||
'thumbnail': 're:http://.*\.jpg',
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
|
@ -13,7 +13,7 @@ class CharlieRoseIE(InfoExtractor):
|
||||
'id': '27996',
|
||||
'ext': 'mp4',
|
||||
'title': 'Remembering Zaha Hadid',
|
||||
'thumbnail': 're:^https?://.*\.jpg\?\d+',
|
||||
'thumbnail': r're:^https?://.*\.jpg\?\d+',
|
||||
'description': 'We revisit past conversations with Zaha Hadid, in memory of the world renowned Iraqi architect.',
|
||||
'subtitles': {
|
||||
'en': [{
|
||||
|
@ -1,5 +1,7 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import ExtractorError
|
||||
|
||||
@ -31,30 +33,35 @@ class ChaturbateIE(InfoExtractor):
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
m3u8_url = self._search_regex(
|
||||
r'src=(["\'])(?P<url>http.+?\.m3u8.*?)\1', webpage,
|
||||
'playlist', default=None, group='url')
|
||||
m3u8_formats = [(m.group('id').lower(), m.group('url')) for m in re.finditer(
|
||||
r'hlsSource(?P<id>.+?)\s*=\s*(?P<q>["\'])(?P<url>http.+?)(?P=q)', webpage)]
|
||||
|
||||
if not m3u8_url:
|
||||
if not m3u8_formats:
|
||||
error = self._search_regex(
|
||||
[r'<span[^>]+class=(["\'])desc_span\1[^>]*>(?P<error>[^<]+)</span>',
|
||||
r'<div[^>]+id=(["\'])defchat\1[^>]*>\s*<p><strong>(?P<error>[^<]+)<'],
|
||||
webpage, 'error', group='error', default=None)
|
||||
if not error:
|
||||
if any(p not in webpage for p in (
|
||||
if any(p in webpage for p in (
|
||||
self._ROOM_OFFLINE, 'offline_tipping', 'tip_offline')):
|
||||
error = self._ROOM_OFFLINE
|
||||
if error:
|
||||
raise ExtractorError(error, expected=True)
|
||||
raise ExtractorError('Unable to find stream URL')
|
||||
|
||||
formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4')
|
||||
formats = []
|
||||
for m3u8_id, m3u8_url in m3u8_formats:
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
m3u8_url, video_id, ext='mp4',
|
||||
# ffmpeg skips segments for fast m3u8
|
||||
preference=-10 if m3u8_id == 'fast' else None,
|
||||
m3u8_id=m3u8_id, fatal=False, live=True))
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': self._live_title(video_id),
|
||||
'thumbnail': 'https://cdn-s.highwebmedia.com/uHK3McUtGCG3SMFcd4ZJsRv8/roomimage/%s.jpg' % video_id,
|
||||
'thumbnail': 'https://roomimg.stream.highwebmedia.com/ri/%s.jpg' % video_id,
|
||||
'age_limit': self._rta_search(webpage),
|
||||
'is_live': True,
|
||||
'formats': formats,
|
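Review note: the new code collects every `hlsSource…` assignment on the page with `re.finditer` instead of searching for a single `src=` attribute. The sketch below applies the same regular expression to a made-up page fragment; `sample_page` and `find_hls_sources` are illustrative names, and the URLs are placeholders.

```python
import re

# Same pattern as in the hunk above: the text after "hlsSource" becomes the
# format id, and the quoted value becomes the playlist URL.
HLS_SOURCE_RE = r'hlsSource(?P<id>.+?)\s*=\s*(?P<q>["\'])(?P<url>http.+?)(?P=q)'


def find_hls_sources(webpage):
    """Return a list of (lower-cased id, url) pairs found in the page."""
    return [
        (m.group('id').lower(), m.group('url'))
        for m in re.finditer(HLS_SOURCE_RE, webpage)]


# Hypothetical page fragment, used only to exercise the pattern.
sample_page = """
var hlsSourceFast = 'https://example.com/live/fast.m3u8';
var hlsSourceSlow = "https://example.com/live/slow.m3u8";
"""
sources = find_hls_sources(sample_page)
```

The backreference `(?P=q)` makes the closing quote match the opening one, so single- and double-quoted assignments are both handled.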
||||
|
@ -19,6 +19,7 @@ class ChirbitIE(InfoExtractor):
|
||||
'title': 'md5:f542ea253f5255240be4da375c6a5d7e',
|
||||
'description': 'md5:f24a4e22a71763e32da5fed59e47c770',
|
||||
'duration': 306,
|
||||
'uploader': 'Gerryaudio',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
@ -54,6 +55,9 @@ class ChirbitIE(InfoExtractor):
|
||||
duration = parse_duration(self._search_regex(
|
||||
r'class=["\']c-length["\'][^>]*>([^<]+)',
|
||||
webpage, 'duration', fatal=False))
|
||||
uploader = self._search_regex(
|
||||
r'id=["\']chirbit-username["\'][^>]*>([^<]+)',
|
||||
webpage, 'uploader', fatal=False)
|
||||
|
||||
return {
|
||||
'id': audio_id,
|
||||
@ -61,6 +65,7 @@ class ChirbitIE(InfoExtractor):
|
||||
'title': title,
|
||||
'description': description,
|
||||
'duration': duration,
|
||||
'uploader': uploader,
|
||||
}
|
||||
|
||||
|
||||
|
@ -30,7 +30,7 @@ class CliphunterIE(InfoExtractor):
|
||||
'id': '1012420',
|
||||
'ext': 'flv',
|
||||
'title': 'Fun Jynx Maze solo',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'age_limit': 18,
|
||||
},
|
||||
'skip': 'Video gone',
|
||||
@ -41,7 +41,7 @@ class CliphunterIE(InfoExtractor):
|
||||
'id': '2019449',
|
||||
'ext': 'mp4',
|
||||
'title': 'ShesNew - My booty girlfriend, Victoria Paradice\'s pussy filled with jizz',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'age_limit': 18,
|
||||
},
|
||||
}]
|
||||
|
@ -18,7 +18,7 @@ class ClipsyndicateIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Brick Briscoe',
|
||||
'duration': 612,
|
||||
'thumbnail': 're:^https?://.+\.jpg',
|
||||
'thumbnail': r're:^https?://.+\.jpg',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://chic.clipsyndicate.com/video/play/5844117/shark_attack',
|
||||
|
@ -19,7 +19,7 @@ class ClubicIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Clubic Week 2.0 : le FBI se lance dans la photo d\u0092identité',
|
||||
'description': 're:Gueule de bois chez Nokia. Le constructeur a indiqué cette.*',
|
||||
'thumbnail': 're:^http://img\.clubic\.com/.*\.jpg$',
|
||||
'thumbnail': r're:^http://img\.clubic\.com/.*\.jpg$',
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.clubic.com/video/video-clubic-week-2-0-apple-iphone-6s-et-plus-mais-surtout-le-pencil-469792.html',
|
||||
|
@ -1,13 +1,11 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .mtv import MTVIE
|
||||
from ..utils import ExtractorError
|
||||
|
||||
|
||||
class CMTIE(MTVIE):
|
||||
IE_NAME = 'cmt.com'
|
||||
_VALID_URL = r'https?://(?:www\.)?cmt\.com/(?:videos|shows)/(?:[^/]+/)*(?P<videoid>\d+)'
|
||||
_FEED_URL = 'http://www.cmt.com/sitewide/apps/player/embed/rss/'
|
||||
_VALID_URL = r'https?://(?:www\.)?cmt\.com/(?:videos|shows|(?:full-)?episodes|video-clips)/(?P<id>[^/]+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.cmt.com/videos/garth-brooks/989124/the-call-featuring-trisha-yearwood.jhtml#artist=30061',
|
||||
@ -33,17 +31,24 @@ class CMTIE(MTVIE):
|
||||
}, {
|
||||
'url': 'http://www.cmt.com/shows/party-down-south/party-down-south-ep-407-gone-girl/1738172/playlist/#id=1738172',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.cmt.com/full-episodes/537qb3/nashville-the-wayfaring-stranger-season-5-ep-501',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.cmt.com/video-clips/t9e4ci/nashville-juliette-in-2-minutes',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@classmethod
|
||||
def _transform_rtmp_url(cls, rtmp_video_url):
|
||||
if 'error_not_available.swf' in rtmp_video_url:
|
||||
raise ExtractorError(
|
||||
'%s said: video is not available' % cls.IE_NAME, expected=True)
|
||||
|
||||
return super(CMTIE, cls)._transform_rtmp_url(rtmp_video_url)
|
||||
|
||||
def _extract_mgid(self, webpage):
|
||||
return self._search_regex(
|
||||
mgid = self._search_regex(
|
||||
r'MTVN\.VIDEO\.contentUri\s*=\s*([\'"])(?P<mgid>.+?)\1',
|
||||
webpage, 'mgid', group='mgid')
|
||||
webpage, 'mgid', group='mgid', default=None)
|
||||
if not mgid:
|
||||
mgid = self._extract_triforce_mgid(webpage)
|
||||
return mgid
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
mgid = self._extract_mgid(webpage)
|
||||
return self.url_result('http://media.mtvnservices.com/embed/%s' % mgid)
|
||||
|
@ -21,7 +21,7 @@ class CollegeRamaIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Een nieuwe wereld: waarden, bewustzijn en techniek van de mensheid 2.0.',
|
||||
'description': '',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'duration': 7713.088,
|
||||
'timestamp': 1413309600,
|
||||
'upload_date': '20141014',
|
||||
|
@ -48,15 +48,7 @@ class ComedyCentralFullEpisodesIE(MTVServicesInfoExtractor):
|
||||
def _real_extract(self, url):
|
||||
playlist_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, playlist_id)
|
||||
|
||||
feed_json = self._search_regex(r'var triforceManifestFeed\s*=\s*(\{.+?\});\n', webpage, 'triforce feeed')
|
||||
feed = self._parse_json(feed_json, playlist_id)
|
||||
zones = feed['manifest']['zones']
|
||||
|
||||
video_zone = zones['t2_lc_promo1']
|
||||
feed = self._download_json(video_zone['feed'], playlist_id)
|
||||
mgid = feed['result']['data']['id']
|
||||
|
||||
mgid = self._extract_triforce_mgid(webpage, data_zone='t2_lc_promo1')
|
||||
videos_info = self._get_videos_info(mgid)
|
||||
return videos_info
|
||||
|
||||
@ -79,7 +71,7 @@ class ToshIE(MTVServicesInfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Tosh.0|June 9, 2077|2|211|Twitter Users Share Summer Plans',
|
||||
'description': 'Tosh asked fans to share their summer plans.',
|
||||
'thumbnail': 're:^https?://.*\.jpg',
|
||||
'thumbnail': r're:^https?://.*\.jpg',
|
||||
# It's really reported to be published on year 2077
|
||||
'upload_date': '20770610',
|
||||
'timestamp': 3390510600,
|
||||
@ -93,12 +85,6 @@ class ToshIE(MTVServicesInfoExtractor):
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@classmethod
|
||||
def _transform_rtmp_url(cls, rtmp_video_url):
|
||||
new_urls = super(ToshIE, cls)._transform_rtmp_url(rtmp_video_url)
|
||||
new_urls['rtmp'] = rtmp_video_url.replace('viacomccstrm', 'viacommtvstrm')
|
||||
return new_urls
|
||||
|
||||
|
||||
class ComedyCentralTVIE(MTVServicesInfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?comedycentral\.tv/(?:staffeln|shows)/(?P<id>[^/?#&]+)'
|
||||
|
@ -6,6 +6,7 @@ import hashlib
|
||||
import json
|
||||
import netrc
|
||||
import os
|
||||
import random
|
||||
import re
|
||||
import socket
|
||||
import sys
|
||||
@ -39,7 +40,10 @@ from ..utils import (
|
||||
ExtractorError,
|
||||
fix_xml_ampersands,
|
||||
float_or_none,
|
||||
GeoRestrictedError,
|
||||
GeoUtils,
|
||||
int_or_none,
|
||||
js_to_json,
|
||||
parse_iso8601,
|
||||
RegexNotFoundError,
|
||||
sanitize_filename,
|
||||
@ -59,6 +63,7 @@ from ..utils import (
|
||||
parse_m3u8_attributes,
|
||||
extract_attributes,
|
||||
parse_codecs,
|
||||
urljoin,
|
||||
)
|
||||
|
||||
|
||||
@ -120,9 +125,19 @@ class InfoExtractor(object):
|
||||
download, lower-case.
|
||||
"http", "https", "rtsp", "rtmp", "rtmpe",
|
||||
"m3u8", "m3u8_native" or "http_dash_segments".
|
||||
* fragments A list of fragments of the fragmented media,
|
||||
with the following entries:
|
||||
* "url" (mandatory) - fragment's URL
|
||||
* fragment_base_url
|
||||
Base URL for fragments. Each fragment's path
|
||||
value (if present) will be relative to
|
||||
this URL.
|
||||
* fragments A list of fragments of a fragmented media.
|
||||
Each fragment entry must contain either a URL
|
||||
or a path. If a URL is present it should be
|
||||
considered by a client. Otherwise both path and
|
||||
fragment_base_url must be present. Here is
|
||||
the list of all potential fields:
|
||||
* "url" - fragment's URL
|
||||
* "path" - fragment's path relative to
|
||||
fragment_base_url
|
||||
* "duration" (optional, int or float)
|
||||
* "filesize" (optional, int)
|
||||
* preference Order number of this format. If this field is
|
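Review note: the updated docstring says a fragment carries either a `url` or a `path` that is relative to `fragment_base_url`. One way a client might resolve that is sketched below. `resolve_fragment_url` is a hypothetical helper, not a youtube-dl function, and it uses Python 3's `urllib.parse.urljoin` as an assumption about how the join could be done.

```python
from urllib.parse import urljoin


def resolve_fragment_url(fragment, fragment_base_url=None):
    """Return the absolute URL of a fragment entry.

    A "url" entry wins when present; otherwise both "path" and
    fragment_base_url are required, as the docstring above states.
    """
    url = fragment.get('url')
    if url:
        return url
    path = fragment.get('path')
    if path is None or fragment_base_url is None:
        raise ValueError('fragment needs "url", or "path" plus fragment_base_url')
    return urljoin(fragment_base_url, path)


# Hypothetical fragments for illustration.
by_url = resolve_fragment_url({'url': 'https://example.com/media/init.mp4'})
by_path = resolve_fragment_url(
    {'path': 'seg-1.m4s', 'duration': 4.0}, 'https://example.com/media/')
```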
||||
@ -188,9 +203,10 @@ class InfoExtractor(object):
|
||||
uploader_url: Full URL to a personal webpage of the video uploader.
|
||||
location: Physical location where the video was filmed.
|
||||
subtitles: The available subtitles as a dictionary in the format
|
||||
{language: subformats}. "subformats" is a list sorted from
|
||||
lower to higher preference, each element is a dictionary
|
||||
with the "ext" entry and one of:
|
||||
{tag: subformats}. "tag" is usually a language code, and
|
||||
"subformats" is a list sorted from lower to higher
|
||||
preference, each element is a dictionary with the "ext"
|
||||
entry and one of:
|
||||
* "data": The subtitles file contents
|
||||
* "url": A URL pointing to the subtitles file
|
||||
"ext" will be calculated from URL if missing
|
||||
@ -307,17 +323,34 @@ class InfoExtractor(object):
|
||||
_real_extract() methods and define a _VALID_URL regexp.
|
||||
Probably, they should also be added to the list of extractors.
|
||||
|
||||
_GEO_BYPASS attribute may be set to False in order to disable
|
||||
geo restriction bypass mechanisms for a particular extractor.
|
||||
Though it won't disable explicit geo restriction bypass based on
|
||||
country code provided with geo_bypass_country. (experimental)
|
||||
|
||||
_GEO_COUNTRIES attribute may contain a list of presumably geo unrestricted
|
||||
countries for this extractor. One of these countries will be used by
|
||||
geo restriction bypass mechanism right away in order to bypass
|
||||
geo restriction, of course, if the mechanism is not disabled. (experimental)
|
||||
|
||||
NB: both these geo attributes are experimental and may change in the future
|
||||
or be completely removed.
|
||||
|
||||
Finally, the _WORKING attribute should be set to False for broken IEs
|
||||
in order to warn the users and skip the tests.
|
||||
"""
|
||||
|
||||
_ready = False
|
||||
_downloader = None
|
||||
_x_forwarded_for_ip = None
|
||||
_GEO_BYPASS = True
|
||||
_GEO_COUNTRIES = None
|
||||
_WORKING = True
|
||||
|
||||
def __init__(self, downloader=None):
|
||||
"""Constructor. Receives an optional downloader."""
|
||||
self._ready = False
|
||||
self._x_forwarded_for_ip = None
|
||||
self.set_downloader(downloader)
|
||||
|
||||
@classmethod
|
||||
@ -346,15 +379,59 @@ class InfoExtractor(object):
|
||||
|
||||
def initialize(self):
|
||||
"""Initializes an instance (authentication, etc)."""
|
||||
self._initialize_geo_bypass(self._GEO_COUNTRIES)
|
||||
if not self._ready:
|
||||
self._real_initialize()
|
||||
self._ready = True
|
||||
|
||||
def _initialize_geo_bypass(self, countries):
|
||||
"""
|
||||
Initialize geo restriction bypass mechanism.
|
||||
|
||||
This method is used to initialize geo bypass mechanism based on faking
|
||||
X-Forwarded-For HTTP header. A random country from provided country list
|
||||
is selected and a random IP belonging to this country is generated. This
|
||||
IP will be passed as X-Forwarded-For HTTP header in all subsequent
|
||||
HTTP requests.
|
||||
|
||||
This method will be used for initial geo bypass mechanism initialization
|
||||
during the instance initialization with _GEO_COUNTRIES.
|
||||
|
||||
You may also manually call it from extractor's code if geo countries
|
||||
information is not available beforehand (e.g. obtained during
|
||||
extraction) or for some other reason.
|
||||
"""
|
||||
if not self._x_forwarded_for_ip:
|
||||
country_code = self._downloader.params.get('geo_bypass_country', None)
|
||||
# If there is no explicit country for geo bypass specified and
|
||||
# the extractor is known to be geo restricted let's fake IP
|
||||
# as X-Forwarded-For right away.
|
||||
if (not country_code and
|
||||
self._GEO_BYPASS and
|
||||
self._downloader.params.get('geo_bypass', True) and
|
||||
countries):
|
||||
country_code = random.choice(countries)
|
||||
if country_code:
|
||||
self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code)
|
||||
if self._downloader.params.get('verbose', False):
|
||||
self._downloader.to_stdout(
|
||||
'[debug] Using fake IP %s (%s) as X-Forwarded-For.'
|
||||
% (self._x_forwarded_for_ip, country_code.upper()))
|
||||
|
||||
def extract(self, url):
|
||||
"""Extracts URL information and returns it in list of dicts."""
|
||||
try:
|
||||
self.initialize()
|
||||
return self._real_extract(url)
|
||||
for _ in range(2):
|
||||
try:
|
||||
self.initialize()
|
||||
ie_result = self._real_extract(url)
|
||||
if self._x_forwarded_for_ip:
|
||||
ie_result['__x_forwarded_for_ip'] = self._x_forwarded_for_ip
|
||||
return ie_result
|
||||
except GeoRestrictedError as e:
|
||||
if self.__maybe_fake_ip_and_retry(e.countries):
|
||||
continue
|
||||
raise
|
||||
except ExtractorError:
|
||||
raise
|
||||
except compat_http_client.IncompleteRead as e:
|
||||
@ -362,6 +439,21 @@ class InfoExtractor(object):
|
||||
except (KeyError, StopIteration) as e:
|
||||
raise ExtractorError('An extractor error has occurred.', cause=e)
|
||||
|
||||
def __maybe_fake_ip_and_retry(self, countries):
|
||||
if (not self._downloader.params.get('geo_bypass_country', None) and
|
||||
self._GEO_BYPASS and
|
||||
self._downloader.params.get('geo_bypass', True) and
|
||||
not self._x_forwarded_for_ip and
|
||||
countries):
|
||||
country_code = random.choice(countries)
|
||||
self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code)
|
||||
if self._x_forwarded_for_ip:
|
||||
self.report_warning(
|
||||
'Video is geo restricted. Retrying extraction with fake IP %s (%s) as X-Forwarded-For.'
|
||||
% (self._x_forwarded_for_ip, country_code.upper()))
|
||||
return True
|
||||
return False
|
||||
|
||||
def set_downloader(self, downloader):
|
||||
"""Sets the downloader for this IE."""
|
||||
self._downloader = downloader
|
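Review note: `extract()` now runs at most two attempts and retries once with a faked IP when a geo restriction error carries a country list. The control flow can be sketched in isolation as follows. `GeoRestricted`, `extract_with_geo_retry`, and the callables passed to it are stand-ins invented for this illustration; they are not the names used in the codebase.

```python
class GeoRestricted(Exception):
    """Stand-in for a geo restriction error that carries a country list."""

    def __init__(self, msg, countries=None):
        super().__init__(msg)
        self.countries = countries


def extract_with_geo_retry(real_extract, choose_fake_ip, attempts=2):
    """Call real_extract(fake_ip); retry once with a fake IP on geo errors."""
    fake_ip = None
    for _ in range(attempts):
        try:
            result = real_extract(fake_ip)
            if fake_ip:
                # Record which IP was used so later requests can reuse it.
                result['__x_forwarded_for_ip'] = fake_ip
            return result
        except GeoRestricted as e:
            # Only fake an IP once, and only if candidate countries are known.
            if fake_ip is None and e.countries:
                fake_ip = choose_fake_ip(e.countries)
                if fake_ip:
                    continue
            raise


# Hypothetical extractor that is blocked unless a fake IP is supplied.
calls = []


def fake_extract(ip):
    calls.append(ip)
    if ip is None:
        raise GeoRestricted('blocked', countries=['CZ'])
    return {'id': 'x'}


result = extract_with_geo_retry(fake_extract, lambda countries: '192.0.2.1')
```

If the second attempt is also geo restricted, the error propagates, which mirrors the single retry in the hunk above.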
||||
@ -421,6 +513,15 @@ class InfoExtractor(object):
|
||||
if isinstance(url_or_request, (compat_str, str)):
|
||||
url_or_request = url_or_request.partition('#')[0]
|
||||
|
||||
# Some sites check X-Forwarded-For HTTP header in order to figure out
|
||||
# the origin of the client behind proxy. This allows bypassing geo
|
||||
# restriction by faking this header's value to IP that belongs to some
|
||||
# geo unrestricted country. We will do so once we encounter any
|
||||
# geo restriction error.
|
||||
if self._x_forwarded_for_ip:
|
||||
if 'X-Forwarded-For' not in headers:
|
||||
headers['X-Forwarded-For'] = self._x_forwarded_for_ip
|
||||
|
||||
urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query)
|
||||
if urlh is False:
|
||||
assert not fatal
|
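Review note: the comment above describes faking `X-Forwarded-For` with an IP from an unrestricted country. A self-contained sketch of the selection and header injection is given below. The block table `EXAMPLE_COUNTRY_BLOCKS` is a made-up assumption that uses documentation-only address ranges, and the helper names are hypothetical; the real per-country data lives elsewhere in the project.

```python
import ipaddress
import random

# Hypothetical country -> CIDR table (documentation ranges, not real data).
EXAMPLE_COUNTRY_BLOCKS = {
    'CZ': '192.0.2.0/24',
    'DE': '198.51.100.0/24',
}


def random_ipv4_for(country_code, rng=random):
    """Pick a random address inside the example block of a country."""
    block = EXAMPLE_COUNTRY_BLOCKS.get(country_code.upper())
    if not block:
        return None
    network = ipaddress.ip_network(block)
    return str(network[rng.randrange(network.num_addresses)])


def pick_fake_ip(params, geo_bypass_enabled, countries, rng=random):
    """An explicit country wins; otherwise choose one only if bypass is on."""
    country_code = params.get('geo_bypass_country')
    if (not country_code and geo_bypass_enabled
            and params.get('geo_bypass', True) and countries):
        country_code = rng.choice(countries)
    if not country_code:
        return None
    return random_ipv4_for(country_code, rng)


def with_forwarded_for(headers, fake_ip):
    """Return a copy of headers, adding X-Forwarded-For unless already set."""
    headers = dict(headers)
    if fake_ip and 'X-Forwarded-For' not in headers:
        headers['X-Forwarded-For'] = fake_ip
    return headers


fake_ip = pick_fake_ip({}, True, ['CZ'])
request_headers = with_forwarded_for({'Referer': 'https://example.com/'}, fake_ip)
```

An explicitly set header is left untouched, matching the `not in headers` check in the hunk.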
||||
@ -596,10 +697,8 @@ class InfoExtractor(object):
|
||||
expected=True)
|
||||
|
||||
@staticmethod
|
||||
def raise_geo_restricted(msg='This video is not available from your location due to geo restriction'):
|
||||
raise ExtractorError(
|
||||
'%s. You might want to use --proxy to workaround.' % msg,
|
||||
expected=True)
|
||||
def raise_geo_restricted(msg='This video is not available from your location due to geo restriction', countries=None):
|
||||
raise GeoRestrictedError(msg, countries=countries)
|
||||
|
||||
# Methods for following #608
|
||||
@staticmethod
|
||||
@ -1013,13 +1112,13 @@ class InfoExtractor(object):
|
||||
unique_formats.append(f)
|
||||
formats[:] = unique_formats
|
||||
|
||||
def _is_valid_url(self, url, video_id, item='video'):
|
||||
def _is_valid_url(self, url, video_id, item='video', headers={}):
|
||||
url = self._proto_relative_url(url, scheme='http:')
|
||||
# For now assume non HTTP(S) URLs always valid
|
||||
if not (url.startswith('http://') or url.startswith('https://')):
|
||||
return True
|
||||
try:
|
||||
self._request_webpage(url, video_id, 'Checking %s URL' % item)
|
||||
self._request_webpage(url, video_id, 'Checking %s URL' % item, headers=headers)
|
||||
return True
|
||||
except ExtractorError as e:
|
||||
if isinstance(e.cause, compat_urllib_error.URLError):
|
||||
@ -1196,6 +1295,9 @@ class InfoExtractor(object):
|
||||
m3u8_doc, urlh = res
|
||||
m3u8_url = urlh.geturl()
|
||||
|
||||
if '#EXT-X-FAXS-CM:' in m3u8_doc: # Adobe Flash Access
|
||||
return []
|
||||
|
||||
formats = [self._m3u8_meta_format(m3u8_url, ext, preference, m3u8_id)]
|
||||
|
||||
format_url = lambda u: (
|
||||
@ -1224,7 +1326,7 @@ class InfoExtractor(object):
|
||||
'protocol': entry_protocol,
|
||||
'preference': preference,
|
||||
}]
|
||||
audio_groups = set()
|
||||
audio_in_video_stream = {}
|
||||
last_info = {}
|
||||
last_media = {}
|
||||
for line in m3u8_doc.splitlines():
|
||||
@ -1234,10 +1336,11 @@ class InfoExtractor(object):
|
||||
media = parse_m3u8_attributes(line)
|
||||
media_type = media.get('TYPE')
|
||||
if media_type in ('VIDEO', 'AUDIO'):
|
||||
group_id = media.get('GROUP-ID')
|
||||
media_url = media.get('URI')
|
||||
if media_url:
|
||||
format_id = []
|
||||
for v in (media.get('GROUP-ID'), media.get('NAME')):
|
||||
for v in (group_id, media.get('NAME')):
|
||||
if v:
|
||||
format_id.append(v)
|
||||
f = {
|
||||
@ -1250,12 +1353,15 @@ class InfoExtractor(object):
|
||||
}
|
||||
if media_type == 'AUDIO':
|
||||
f['vcodec'] = 'none'
|
||||
audio_groups.add(media['GROUP-ID'])
|
||||
if group_id and not audio_in_video_stream.get(group_id):
|
||||
audio_in_video_stream[group_id] = False
|
||||
formats.append(f)
|
||||
else:
|
||||
# When there is no URI in EXT-X-MEDIA let this tag's
|
||||
# data be used by regular URI lines below
|
||||
last_media = media
|
||||
if media_type == 'AUDIO' and group_id:
|
||||
audio_in_video_stream[group_id] = True
|
||||
elif line.startswith('#') or not line.strip():
|
||||
continue
|
||||
else:
|
||||
@ -1299,8 +1405,8 @@ class InfoExtractor(object):
|
||||
'abr': abr,
|
||||
})
|
||||
f.update(parse_codecs(last_info.get('CODECS')))
|
||||
if last_info.get('AUDIO') in audio_groups:
|
||||
# TODO: update acodec for for audio only formats with the same GROUP-ID
|
||||
if audio_in_video_stream.get(last_info.get('AUDIO')) is False and f['vcodec'] != 'none':
|
||||
# TODO: update acodec for audio only formats with the same GROUP-ID
|
||||
f['acodec'] = 'none'
|
||||
formats.append(f)
|
||||
last_info = {}
|
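Review note: the m3u8 changes replace the `audio_groups` set with a per-group flag, so a variant is marked video-only only when its audio group is delivered through separate renditions. The decision logic can be sketched on already-parsed attribute dictionaries as below. `mark_video_only` and the sample data are hypothetical and skip the actual playlist parsing.

```python
def mark_video_only(media_tags, variants):
    """Set acodec to 'none' on variants whose audio group is separate.

    media_tags: parsed EXT-X-MEDIA attribute dicts.
    variants: dicts with an 'AUDIO' group reference and a 'vcodec' value.
    """
    audio_in_video_stream = {}
    for media in media_tags:
        if media.get('TYPE') != 'AUDIO':
            continue
        group_id = media.get('GROUP-ID')
        if not group_id:
            continue
        if media.get('URI'):
            # Separate audio rendition; keep an earlier True flag if present.
            if not audio_in_video_stream.get(group_id):
                audio_in_video_stream[group_id] = False
        else:
            # No URI: the audio is carried inside the video stream.
            audio_in_video_stream[group_id] = True

    for f in variants:
        separate = audio_in_video_stream.get(f.get('AUDIO')) is False
        if separate and f['vcodec'] != 'none':
            f['acodec'] = 'none'
    return variants


# Hypothetical parsed playlist data.
media_tags = [
    {'TYPE': 'AUDIO', 'GROUP-ID': 'aud-sep', 'URI': 'audio.m3u8'},
    {'TYPE': 'AUDIO', 'GROUP-ID': 'aud-mux'},
]
variants = mark_video_only(media_tags, [
    {'AUDIO': 'aud-sep', 'vcodec': 'avc1.64001f'},
    {'AUDIO': 'aud-mux', 'vcodec': 'avc1.64001f'},
])
```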
||||
@ -1621,21 +1727,16 @@ class InfoExtractor(object):
|
||||
segment_template = element.find(_add_ns('SegmentTemplate'))
|
||||
if segment_template is not None:
|
||||
extract_common(segment_template)
|
||||
media_template = segment_template.get('media')
|
||||
if media_template:
|
||||
ms_info['media_template'] = media_template
|
||||
media = segment_template.get('media')
|
||||
if media:
|
||||
ms_info['media'] = media
|
||||
initialization = segment_template.get('initialization')
|
||||
if initialization:
|
||||
ms_info['initialization_url'] = initialization
|
||||
ms_info['initialization'] = initialization
|
||||
else:
|
||||
extract_Initialization(segment_template)
|
||||
return ms_info
|
||||
|
||||
def combine_url(base_url, target_url):
|
||||
if re.match(r'^https?://', target_url):
|
||||
return target_url
|
||||
return '%s%s%s' % (base_url, '' if base_url.endswith('/') else '/', target_url)
|
||||
|
||||
mpd_duration = parse_duration(mpd_doc.get('mediaPresentationDuration'))
|
||||
formats = []
|
||||
for period in mpd_doc.findall(_add_ns('Period')):
|
||||
@ -1675,6 +1776,7 @@ class InfoExtractor(object):
|
||||
lang = representation_attrib.get('lang')
|
||||
url_el = representation.find(_add_ns('BaseURL'))
|
||||
filesize = int_or_none(url_el.attrib.get('{http://youtube.com/yt/2012/10/10}contentLength') if url_el is not None else None)
|
||||
bandwidth = int_or_none(representation_attrib.get('bandwidth'))
|
||||
f = {
|
||||
'format_id': '%s-%s' % (mpd_id, representation_id) if mpd_id else representation_id,
|
||||
'url': base_url,
|
||||
@ -1682,23 +1784,41 @@ class InfoExtractor(object):
|
||||
'ext': mimetype2ext(mime_type),
|
||||
'width': int_or_none(representation_attrib.get('width')),
|
||||
'height': int_or_none(representation_attrib.get('height')),
|
||||
'tbr': int_or_none(representation_attrib.get('bandwidth'), 1000),
|
||||
'tbr': int_or_none(bandwidth, 1000),
|
||||
'asr': int_or_none(representation_attrib.get('audioSamplingRate')),
|
||||
'fps': int_or_none(representation_attrib.get('frameRate')),
|
||||
'vcodec': 'none' if content_type == 'audio' else representation_attrib.get('codecs'),
|
||||
'acodec': 'none' if content_type == 'video' else representation_attrib.get('codecs'),
|
||||
'language': lang if lang not in ('mul', 'und', 'zxx', 'mis') else None,
|
||||
'format_note': 'DASH %s' % content_type,
|
||||
'filesize': filesize,
|
||||
}
|
||||
f.update(parse_codecs(representation_attrib.get('codecs')))
|
||||
representation_ms_info = extract_multisegment_info(representation, adaption_set_ms_info)
|
||||
if 'segment_urls' not in representation_ms_info and 'media_template' in representation_ms_info:
|
||||
|
||||
media_template = representation_ms_info['media_template']
|
||||
media_template = media_template.replace('$RepresentationID$', representation_id)
|
||||
media_template = re.sub(r'\$(Number|Bandwidth|Time)\$', r'%(\1)d', media_template)
|
||||
media_template = re.sub(r'\$(Number|Bandwidth|Time)%([^$]+)\$', r'%(\1)\2', media_template)
|
||||
media_template.replace('$$', '$')
|
||||
def prepare_template(template_name, identifiers):
|
||||
t = representation_ms_info[template_name]
|
||||
t = t.replace('$RepresentationID$', representation_id)
|
||||
t = re.sub(r'\$(%s)\$' % '|'.join(identifiers), r'%(\1)d', t)
|
||||
t = re.sub(r'\$(%s)%%([^$]+)\$' % '|'.join(identifiers), r'%(\1)\2', t)
|
||||
t = t.replace('$$', '$')
|
||||
return t
|
||||
|
||||
# @initialization is a regular template like @media one
|
||||
# so it should be handled just the same way (see
|
||||
# https://github.com/rg3/youtube-dl/issues/11605)
|
||||
if 'initialization' in representation_ms_info:
|
||||
initialization_template = prepare_template(
|
||||
'initialization',
|
||||
# As per [1, 5.3.9.4.2, Table 15, page 54] $Number$ and
|
||||
# $Time$ shall not be included for @initialization thus
|
||||
# only $Bandwidth$ remains
|
||||
('Bandwidth', ))
|
||||
representation_ms_info['initialization_url'] = initialization_template % {
|
||||
'Bandwidth': bandwidth,
|
||||
}
|
||||
|
||||
if 'segment_urls' not in representation_ms_info and 'media' in representation_ms_info:
|
||||
|
||||
media_template = prepare_template('media', ('Number', 'Bandwidth', 'Time'))
|
||||
|
||||
# As per [1, 5.3.9.4.4, Table 16, page 55] $Number$ and $Time$
|
||||
# can't be used at the same time
|
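Review note: `prepare_template` turns DASH `$Identifier$` placeholders into Python `%`-style format fields so that both the `media` and `initialization` templates can be filled in the same way. A standalone sketch is shown below. It takes the template string directly instead of reading it from `representation_ms_info`, which is a simplification made for this example, and the sample templates are invented.

```python
import re


def prepare_template(template, representation_id, identifiers):
    """Convert a DASH SegmentTemplate string into a %-format string."""
    t = template.replace('$RepresentationID$', representation_id)
    # $Number$ -> %(Number)d
    t = re.sub(r'\$(%s)\$' % '|'.join(identifiers), r'%(\1)d', t)
    # $Number%05d$ -> %(Number)05d
    t = re.sub(r'\$(%s)%%([^$]+)\$' % '|'.join(identifiers), r'%(\1)\2', t)
    # str.replace returns a new string, so its result is assigned here.
    t = t.replace('$$', '$')
    return t


# Hypothetical templates for illustration.
media_template = prepare_template(
    'video_$RepresentationID$_$Number%05d$.m4s', 'v1',
    ('Number', 'Bandwidth', 'Time'))
init_template = prepare_template(
    'init_$Bandwidth$.mp4', 'v1', ('Bandwidth', ))

segment_name = media_template % {'Number': 7, 'Bandwidth': 128000}
init_name = init_template % {'Bandwidth': 128000}
```

Restricting the identifier tuple to `Bandwidth` for the initialization template follows the comment in the hunk about `$Number$` and `$Time$` not being allowed there.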
||||
@ -1710,7 +1830,7 @@ class InfoExtractor(object):
|
||||
representation_ms_info['fragments'] = [{
|
||||
'url': media_template % {
|
||||
'Number': segment_number,
|
||||
'Bandwidth': int_or_none(representation_attrib.get('bandwidth')),
|
||||
'Bandwidth': bandwidth,
|
||||
},
|
||||
'duration': segment_duration,
|
||||
} for segment_number in range(
|
||||
@ -1728,7 +1848,7 @@ class InfoExtractor(object):
|
||||
def add_segment_url():
|
||||
segment_url = media_template % {
|
||||
'Time': segment_time,
|
||||
'Bandwidth': int_or_none(representation_attrib.get('bandwidth')),
|
||||
'Bandwidth': bandwidth,
|
||||
'Number': segment_number,
|
||||
}
|
||||
representation_ms_info['fragments'].append({
|
||||
@ -1751,14 +1871,16 @@ class InfoExtractor(object):
|
||||
# Example: https://www.youtube.com/watch?v=iXZV5uAYMJI
|
||||
# or any YouTube dashsegments video
|
||||
fragments = []
|
||||
s_num = 0
|
||||
for segment_url in representation_ms_info['segment_urls']:
|
||||
s = representation_ms_info['s'][s_num]
|
||||
segment_index = 0
|
||||
timescale = representation_ms_info['timescale']
|
||||
for s in representation_ms_info['s']:
|
||||
duration = float_or_none(s['d'], timescale)
|
||||
for r in range(s.get('r', 0) + 1):
|
||||
fragments.append({
|
||||
'url': segment_url,
|
||||
'duration': float_or_none(s['d'], representation_ms_info['timescale']),
|
||||
'url': representation_ms_info['segment_urls'][segment_index],
|
||||
'duration': duration,
|
||||
})
|
||||
segment_index += 1
|
||||
representation_ms_info['fragments'] = fragments
|
||||
# NB: MPD manifest may contain direct URLs to unfragmented media.
|
||||
# No fragments key is present in this case.
|
||||
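Review note: the revised SegmentTimeline loop above walks the `s` entries and consumes one explicit segment URL per repeat. A minimal standalone sketch of that pairing follows. The function name `expand_segment_timeline` and the sample data are hypothetical, and plain division stands in for `float_or_none`.

```python
def expand_segment_timeline(segment_urls, s_entries, timescale):
    """Pair each explicit segment URL with a duration from the timeline."""
    fragments = []
    segment_index = 0
    for s in s_entries:
        # 'd' is expressed in timescale units; convert it to seconds
        duration = s['d'] / float(timescale)
        # 'r' is a repeat count, so one entry covers r + 1 consecutive segments
        for _ in range(s.get('r', 0) + 1):
            fragments.append({
                'url': segment_urls[segment_index],
                'duration': duration,
            })
            segment_index += 1
    return fragments


# Hypothetical sample: three URLs described by two timeline entries
urls = ['seg1.m4s', 'seg2.m4s', 'seg3.m4s']
timeline = [{'d': 90000, 'r': 1}, {'d': 45000}]
fragments = expand_segment_timeline(urls, timeline, 90000)
```

The sketch assumes the manifest lists at least as many URLs as the timeline describes; a shorter URL list would raise `IndexError`.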
@ -1768,13 +1890,13 @@ class InfoExtractor(object):
|
||||
'protocol': 'http_dash_segments',
|
||||
})
|
||||
if 'initialization_url' in representation_ms_info:
|
||||
initialization_url = representation_ms_info['initialization_url'].replace('$RepresentationID$', representation_id)
|
||||
initialization_url = representation_ms_info['initialization_url']
|
||||
if not f.get('url'):
|
||||
f['url'] = initialization_url
|
||||
f['fragments'].append({'url': initialization_url})
|
||||
f['fragments'].extend(representation_ms_info['fragments'])
|
||||
for fragment in f['fragments']:
|
||||
fragment['url'] = combine_url(base_url, fragment['url'])
|
||||
fragment['url'] = urljoin(base_url, fragment['url'])
|
||||
try:
|
||||
existing_format = next(
|
||||
fo for fo in formats
|
||||
@ -1888,7 +2010,7 @@ class InfoExtractor(object):
|
||||
})
|
||||
return formats
|
||||
|
||||
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8'):
|
||||
def _parse_html5_media_entries(self, base_url, webpage, video_id, m3u8_id=None, m3u8_entry_protocol='m3u8', mpd_id=None, preference=None):
|
||||
def absolute_url(video_url):
|
||||
return compat_urlparse.urljoin(base_url, video_url)
|
||||
|
||||
@ -1905,11 +2027,17 @@ class InfoExtractor(object):
|
||||
|
||||
def _media_formats(src, cur_media_type):
|
||||
full_url = absolute_url(src)
|
||||
if determine_ext(full_url) == 'm3u8':
|
||||
ext = determine_ext(full_url)
|
||||
if ext == 'm3u8':
|
||||
is_plain_url = False
|
||||
formats = self._extract_m3u8_formats(
|
||||
full_url, video_id, ext='mp4',
|
||||
entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id)
|
||||
entry_protocol=m3u8_entry_protocol, m3u8_id=m3u8_id,
|
||||
preference=preference)
|
||||
elif ext == 'mpd':
|
||||
is_plain_url = False
|
||||
formats = self._extract_mpd_formats(
|
||||
full_url, video_id, mpd_id=mpd_id)
|
||||
else:
|
||||
is_plain_url = True
|
||||
formats = [{
|
||||
@ -1922,7 +2050,12 @@ class InfoExtractor(object):
|
||||
media_tags = [(media_tag, media_type, '')
|
||||
for media_tag, media_type
|
||||
in re.findall(r'(?s)(<(video|audio)[^>]*/>)', webpage)]
|
||||
media_tags.extend(re.findall(r'(?s)(<(?P<tag>video|audio)[^>]*>)(.*?)</(?P=tag)>', webpage))
|
||||
media_tags.extend(re.findall(
|
||||
# We only allow video|audio followed by a whitespace or '>'.
|
||||
# Allowing more characters may end up in significant slow down (see
|
||||
# https://github.com/rg3/youtube-dl/issues/11979, example URL:
|
||||
# http://www.porntrex.com/maps/videositemap.xml).
|
||||
r'(?s)(<(?P<tag>video|audio)(?:\s+[^>]*)?>)(.*?)</(?P=tag)>', webpage))
|
||||
for media_tag, media_type, media_content in media_tags:
|
||||
media_info = {
|
||||
'formats': [],
|
||||
@ -1962,10 +2095,13 @@ class InfoExtractor(object):
|
||||
entries.append(media_info)
|
||||
return entries
|
||||
|
||||
def _extract_akamai_formats(self, manifest_url, video_id):
|
||||
def _extract_akamai_formats(self, manifest_url, video_id, hosts={}):
|
||||
formats = []
|
||||
hdcore_sign = 'hdcore=3.7.0'
|
||||
f4m_url = re.sub(r'(https?://.+?)/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
|
||||
f4m_url = re.sub(r'(https?://[^/]+)/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
|
||||
hds_host = hosts.get('hds')
|
||||
if hds_host:
|
||||
f4m_url = re.sub(r'(https?://)[^/]+', r'\1' + hds_host, f4m_url)
|
||||
if 'hdcore=' not in f4m_url:
|
||||
f4m_url += ('&' if '?' in f4m_url else '?') + hdcore_sign
|
||||
f4m_formats = self._extract_f4m_formats(
|
||||
@ -1973,7 +2109,10 @@ class InfoExtractor(object):
|
||||
for entry in f4m_formats:
|
||||
entry.update({'extra_param_to_segment_url': hdcore_sign})
|
||||
formats.extend(f4m_formats)
|
||||
m3u8_url = re.sub(r'(https?://.+?)/z/', r'\1/i/', manifest_url).replace('/manifest.f4m', '/master.m3u8')
|
||||
m3u8_url = re.sub(r'(https?://[^/]+)/z/', r'\1/i/', manifest_url).replace('/manifest.f4m', '/master.m3u8')
|
||||
hls_host = hosts.get('hls')
|
||||
if hls_host:
|
||||
m3u8_url = re.sub(r'(https?://)[^/]+', r'\1' + hls_host, m3u8_url)
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
m3u8_url, video_id, 'mp4', 'm3u8_native',
|
||||
m3u8_id='hls', fatal=False))
|
||||
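Review note: the Akamai hunk above derives an HDS and an HLS manifest URL from one input URL and optionally swaps the host for each. The sketch below reproduces only that string rewriting, with no network access. The helper name `akamai_variants`, the example hostnames, and the use of `[^/]+` for the host part are assumptions made for illustration.

```python
import re


def akamai_variants(manifest_url, hosts=None):
    """Return (f4m_url, m3u8_url) derived from one Akamai manifest URL."""
    hosts = hosts or {}
    hdcore_sign = 'hdcore=3.7.0'

    # HDS variant: /i/ becomes /z/, master.m3u8 becomes manifest.f4m
    f4m_url = re.sub(
        r'(https?://[^/]+)/i/', r'\1/z/', manifest_url
    ).replace('/master.m3u8', '/manifest.f4m')
    hds_host = hosts.get('hds')
    if hds_host:
        f4m_url = re.sub(r'(https?://)[^/]+', r'\1' + hds_host, f4m_url)
    if 'hdcore=' not in f4m_url:
        f4m_url += ('&' if '?' in f4m_url else '?') + hdcore_sign

    # HLS variant: /z/ becomes /i/, manifest.f4m becomes master.m3u8
    m3u8_url = re.sub(
        r'(https?://[^/]+)/z/', r'\1/i/', manifest_url
    ).replace('/manifest.f4m', '/master.m3u8')
    hls_host = hosts.get('hls')
    if hls_host:
        m3u8_url = re.sub(r'(https?://)[^/]+', r'\1' + hls_host, m3u8_url)
    return f4m_url, m3u8_url


# Hypothetical input URL and replacement host
src = 'https://example-vh.akamaihd.net/i/path/video.csmil/master.m3u8'
f4m_url, m3u8_url = akamai_variants(src, hosts={'hls': 'hls.example.com'})
```

Because the replacement string is built as `r'\1' + host`, this sketch assumes the override host starts with a letter; a host beginning with a digit would be read as part of the group number.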
@ -2022,6 +2161,132 @@ class InfoExtractor(object):
|
||||
})
|
||||
return formats
|
||||
|
||||
@staticmethod
|
||||
def _find_jwplayer_data(webpage):
|
||||
mobj = re.search(
|
||||
r'jwplayer\((?P<quote>[\'"])[^\'" ]+(?P=quote)\)\.setup\s*\((?P<options>[^)]+)\)',
|
||||
webpage)
|
||||
if mobj:
|
||||
return mobj.group('options')
|
||||
|
||||
def _extract_jwplayer_data(self, webpage, video_id, *args, **kwargs):
|
||||
jwplayer_data = self._parse_json(
|
||||
self._find_jwplayer_data(webpage), video_id,
|
||||
transform_source=js_to_json)
|
||||
return self._parse_jwplayer_data(
|
||||
jwplayer_data, video_id, *args, **kwargs)
|
||||
|
||||
def _parse_jwplayer_data(self, jwplayer_data, video_id=None, require_title=True,
|
||||
m3u8_id=None, mpd_id=None, rtmp_params=None, base_url=None):
|
||||
# JWPlayer backward compatibility: flattened playlists
|
||||
# https://github.com/jwplayer/jwplayer/blob/v7.4.3/src/js/api/config.js#L81-L96
|
||||
if 'playlist' not in jwplayer_data:
|
||||
jwplayer_data = {'playlist': [jwplayer_data]}
|
||||
|
||||
entries = []
|
||||
|
||||
# JWPlayer backward compatibility: single playlist item
|
||||
# https://github.com/jwplayer/jwplayer/blob/v7.7.0/src/js/playlist/playlist.js#L10
|
||||
if not isinstance(jwplayer_data['playlist'], list):
|
||||
jwplayer_data['playlist'] = [jwplayer_data['playlist']]
|
||||
|
||||
for video_data in jwplayer_data['playlist']:
|
||||
# JWPlayer backward compatibility: flattened sources
|
||||
# https://github.com/jwplayer/jwplayer/blob/v7.4.3/src/js/playlist/item.js#L29-L35
|
||||
if 'sources' not in video_data:
|
||||
video_data['sources'] = [video_data]
|
||||
|
||||
this_video_id = video_id or video_data['mediaid']
|
||||
|
||||
formats = self._parse_jwplayer_formats(
|
||||
video_data['sources'], video_id=this_video_id, m3u8_id=m3u8_id,
|
||||
mpd_id=mpd_id, rtmp_params=rtmp_params, base_url=base_url)
|
||||
self._sort_formats(formats)
|
||||
|
||||
subtitles = {}
|
||||
tracks = video_data.get('tracks')
|
||||
if tracks and isinstance(tracks, list):
|
||||
for track in tracks:
|
||||
if track.get('kind') != 'captions':
|
||||
continue
|
||||
track_url = urljoin(base_url, track.get('file'))
|
||||
if not track_url:
|
||||
continue
|
||||
subtitles.setdefault(track.get('label') or 'en', []).append({
|
||||
'url': self._proto_relative_url(track_url)
|
||||
})
|
||||
|
||||
entries.append({
|
||||
'id': this_video_id,
|
||||
'title': video_data['title'] if require_title else video_data.get('title'),
|
||||
'description': video_data.get('description'),
|
||||
'thumbnail': self._proto_relative_url(video_data.get('image')),
|
||||
'timestamp': int_or_none(video_data.get('pubdate')),
|
||||
'duration': float_or_none(jwplayer_data.get('duration') or video_data.get('duration')),
|
||||
'subtitles': subtitles,
|
||||
'formats': formats,
|
||||
})
|
||||
if len(entries) == 1:
|
||||
return entries[0]
|
||||
else:
|
||||
return self.playlist_result(entries)
|
||||
|
||||
def _parse_jwplayer_formats(self, jwplayer_sources_data, video_id=None,
|
||||
m3u8_id=None, mpd_id=None, rtmp_params=None, base_url=None):
|
||||
formats = []
|
||||
for source in jwplayer_sources_data:
|
||||
source_url = self._proto_relative_url(source['file'])
|
||||
if base_url:
|
||||
source_url = compat_urlparse.urljoin(base_url, source_url)
|
||||
source_type = source.get('type') or ''
|
||||
ext = mimetype2ext(source_type) or determine_ext(source_url)
|
||||
if source_type == 'hls' or ext == 'm3u8':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
source_url, video_id, 'mp4', entry_protocol='m3u8_native',
|
||||
m3u8_id=m3u8_id, fatal=False))
|
||||
elif ext == 'mpd':
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
source_url, video_id, mpd_id=mpd_id, fatal=False))
|
||||
# https://github.com/jwplayer/jwplayer/blob/master/src/js/providers/default.js#L67
|
||||
elif source_type.startswith('audio') or ext in (
|
||||
'oga', 'aac', 'mp3', 'mpeg', 'vorbis'):
|
||||
formats.append({
|
||||
'url': source_url,
|
||||
'vcodec': 'none',
|
||||
'ext': ext,
|
||||
})
|
||||
else:
|
||||
height = int_or_none(source.get('height'))
|
||||
if height is None:
|
||||
# Often no height is provided but there is a label in
|
||||
# format like "1080p", "720p SD", or 1080.
|
||||
height = int_or_none(self._search_regex(
|
||||
r'^(\d{3,4})[pP]?(?:\b|$)', compat_str(source.get('label') or ''),
|
||||
'height', default=None))
|
||||
a_format = {
|
||||
'url': source_url,
|
||||
'width': int_or_none(source.get('width')),
|
||||
'height': height,
|
||||
'tbr': int_or_none(source.get('bitrate')),
|
||||
'ext': ext,
|
||||
}
|
||||
if source_url.startswith('rtmp'):
|
||||
a_format['ext'] = 'flv'
|
||||
# See com/longtailvideo/jwplayer/media/RTMPMediaProvider.as
|
||||
# of jwplayer.flash.swf
|
||||
rtmp_url_parts = re.split(
|
||||
r'((?:mp4|mp3|flv):)', source_url, 1)
|
||||
if len(rtmp_url_parts) == 3:
|
||||
rtmp_url, prefix, play_path = rtmp_url_parts
|
||||
a_format.update({
|
||||
'url': rtmp_url,
|
||||
'play_path': prefix + play_path,
|
||||
})
|
||||
if rtmp_params:
|
||||
a_format.update(rtmp_params)
|
||||
formats.append(a_format)
|
||||
return formats
|
||||
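Review note: `_parse_jwplayer_formats` above falls back to parsing the height from the source label when no `height` key is present. The sketch below exercises the same regular expression in isolation. The helper name `height_from_label` is hypothetical, and `str` stands in for `compat_str`.

```python
import re


def height_from_label(label):
    """Extract a pixel height from labels such as '1080p', '720p SD' or 1080."""
    mobj = re.search(r'^(\d{3,4})[pP]?(?:\b|$)', str(label or ''))
    return int(mobj.group(1)) if mobj else None


# Labels may be strings or bare integers in JWPlayer configs
heights = [height_from_label(v) for v in ('1080p', '720p SD', 1080, 'HD', None)]
```

Labels that do not start with three or four digits, such as 'HD', yield `None`, so the format is still emitted but without a height.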
|
||||
def _live_title(self, name):
|
||||
""" Generate the title for a live video """
|
||||
now = datetime.datetime.now()
|
||||
|
@ -1,5 +1,7 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import sys
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import ExtractorError
|
||||
|
||||
@ -7,7 +9,7 @@ from ..utils import ExtractorError
|
||||
class CommonMistakesIE(InfoExtractor):
|
||||
IE_DESC = False # Do not list
|
||||
_VALID_URL = r'''(?x)
|
||||
(?:url|URL)
|
||||
(?:url|URL)$
|
||||
'''
|
||||
|
||||
_TESTS = [{
|
||||
@ -33,7 +35,9 @@ class UnicodeBOMIE(InfoExtractor):
|
||||
IE_DESC = False
|
||||
_VALID_URL = r'(?P<bom>\ufeff)(?P<id>.*)$'
|
||||
|
||||
_TESTS = [{
|
||||
# Disable test for python 3.2 since BOM is broken in re in this version
|
||||
# (see https://github.com/rg3/youtube-dl/issues/9751)
|
||||
_TESTS = [] if (3, 0) < sys.version_info <= (3, 3) else [{
|
||||
'url': '\ufeffhttp://www.youtube.com/watch?v=BaW_jenozKc',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
72
youtube_dl/extractor/corus.py
Normal file
@ -0,0 +1,72 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .theplatform import ThePlatformFeedIE
|
||||
from ..utils import int_or_none
|
||||
|
||||
|
||||
class CorusIE(ThePlatformFeedIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:globaltv|etcanada)\.com|(?:hgtv|foodnetwork|slice)\.ca)/(?:video/|(?:[^/]+/)+(?:videos/[a-z0-9-]+-|video\.html\?.*?\bv=))(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.hgtv.ca/shows/bryan-inc/videos/movie-night-popcorn-with-bryan-870923331648/',
|
||||
'md5': '05dcbca777bf1e58c2acbb57168ad3a6',
|
||||
'info_dict': {
|
||||
'id': '870923331648',
|
||||
'ext': 'mp4',
|
||||
'title': 'Movie Night Popcorn with Bryan',
|
||||
'description': 'Bryan whips up homemade popcorn, the old fashion way for Jojo and Lincoln.',
|
||||
'uploader': 'SHWM-NEW',
|
||||
'upload_date': '20170206',
|
||||
'timestamp': 1486392197,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.foodnetwork.ca/shows/chopped/video/episode/chocolate-obsession/video.html?v=872683587753',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://etcanada.com/video/873675331955/meet-the-survivor-game-changers-castaways-part-2/',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
_TP_FEEDS = {
|
||||
'globaltv': {
|
||||
'feed_id': 'ChQqrem0lNUp',
|
||||
'account_id': 2269680845,
|
||||
},
|
||||
'etcanada': {
|
||||
'feed_id': 'ChQqrem0lNUp',
|
||||
'account_id': 2269680845,
|
||||
},
|
||||
'hgtv': {
|
||||
'feed_id': 'L0BMHXi2no43',
|
||||
'account_id': 2414428465,
|
||||
},
|
||||
'foodnetwork': {
|
||||
'feed_id': 'ukK8o58zbRmJ',
|
||||
'account_id': 2414429569,
|
||||
},
|
||||
'slice': {
|
||||
'feed_id': '5tUJLgV2YNJ5',
|
||||
'account_id': 2414427935,
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
domain, video_id = re.match(self._VALID_URL, url).groups()
|
||||
feed_info = self._TP_FEEDS[domain.split('.')[0]]
|
||||
return self._extract_feed_info('dtjsEC', feed_info['feed_id'], 'byId=' + video_id, video_id, lambda e: {
|
||||
'episode_number': int_or_none(e.get('pl1$episode')),
|
||||
'season_number': int_or_none(e.get('pl1$season')),
|
||||
'series': e.get('pl1$show'),
|
||||
}, {
|
||||
'HLS': {
|
||||
'manifest': 'm3u',
|
||||
},
|
||||
'DesktopHLS Default': {
|
||||
'manifest': 'm3u',
|
||||
},
|
||||
'MP4 MBR': {
|
||||
'manifest': 'm3u',
|
||||
},
|
||||
}, feed_info['account_id'])
|
@ -20,7 +20,7 @@ class CoubIE(InfoExtractor):
|
||||
'id': '5u5n1',
|
||||
'ext': 'mp4',
|
||||
'title': 'The Matrix Moonwalk',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'duration': 4.6,
|
||||
'timestamp': 1428527772,
|
||||
'upload_date': '20150408',
|
||||
|
@ -6,7 +6,8 @@ from ..utils import int_or_none
|
||||
|
||||
|
||||
class CrackleIE(InfoExtractor):
|
||||
_VALID_URL = r'(?:crackle:|https?://(?:www\.)?crackle\.com/(?:playlist/\d+/|(?:[^/]+/)+))(?P<id>\d+)'
|
||||
_GEO_COUNTRIES = ['US']
|
||||
_VALID_URL = r'(?:crackle:|https?://(?:(?:www|m)\.)?crackle\.com/(?:playlist/\d+/|(?:[^/]+/)+))(?P<id>\d+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.crackle.com/comedians-in-cars-getting-coffee/2498934',
|
||||
'info_dict': {
|
||||
@ -14,7 +15,7 @@ class CrackleIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Everybody Respects A Bloody Nose',
|
||||
'description': 'Jerry is kaffeeklatsching in L.A. with funnyman J.B. Smoove (Saturday Night Live, Real Husbands of Hollywood). They’re headed for brew at 10 Speed Coffee in a 1964 Studebaker Avanti.',
|
||||
'thumbnail': 're:^https?://.*\.jpg',
|
||||
'thumbnail': r're:^https?://.*\.jpg',
|
||||
'duration': 906,
|
||||
'series': 'Comedians In Cars Getting Coffee',
|
||||
'season_number': 8,
|
||||
@ -31,8 +32,32 @@ class CrackleIE(InfoExtractor):
|
||||
}
|
||||
}
|
||||
|
||||
_THUMBNAIL_RES = [
|
||||
(120, 90),
|
||||
(208, 156),
|
||||
(220, 124),
|
||||
(220, 220),
|
||||
(240, 180),
|
||||
(250, 141),
|
||||
(315, 236),
|
||||
(320, 180),
|
||||
(360, 203),
|
||||
(400, 300),
|
||||
(421, 316),
|
||||
(460, 330),
|
||||
(460, 460),
|
||||
(462, 260),
|
||||
(480, 270),
|
||||
(587, 330),
|
||||
(640, 480),
|
||||
(700, 330),
|
||||
(700, 394),
|
||||
(854, 480),
|
||||
(1024, 1024),
|
||||
(1920, 1080),
|
||||
]
|
||||
|
||||
# extracted from http://legacyweb-us.crackle.com/flash/ReferrerRedirect.ashx
|
||||
_THUMBNAIL_TEMPLATE = 'http://images-us-am.crackle.com/%stnl_1920x1080.jpg?ts=20140107233116?c=635333335057637614'
|
||||
_MEDIA_FILE_SLOTS = {
|
||||
'c544.flv': {
|
||||
'width': 544,
|
||||
@ -61,17 +86,25 @@ class CrackleIE(InfoExtractor):
|
||||
|
||||
item = self._download_xml(
|
||||
'http://legacyweb-us.crackle.com/app/revamp/vidwallcache.aspx?flags=-1&fm=%s' % video_id,
|
||||
video_id).find('i')
|
||||
video_id, headers=self.geo_verification_headers()).find('i')
|
||||
title = item.attrib['t']
|
||||
|
||||
subtitles = {}
|
||||
formats = self._extract_m3u8_formats(
|
||||
'http://content.uplynk.com/ext/%s/%s.m3u8' % (config_doc.attrib['strUplynkOwnerId'], video_id),
|
||||
video_id, 'mp4', m3u8_id='hls', fatal=None)
|
||||
thumbnail = None
|
||||
thumbnails = []
|
||||
path = item.attrib.get('p')
|
||||
if path:
|
||||
thumbnail = self._THUMBNAIL_TEMPLATE % path
|
||||
for width, height in self._THUMBNAIL_RES:
|
||||
res = '%dx%d' % (width, height)
|
||||
thumbnails.append({
|
||||
'id': res,
|
||||
'url': 'http://images-us-am.crackle.com/%stnl_%s.jpg' % (path, res),
|
||||
'width': width,
|
||||
'height': height,
|
||||
'resolution': res,
|
||||
})
|
||||
http_base_url = 'http://ahttp.crackle.com/' + path
|
||||
for mfs_path, mfs_info in self._MEDIA_FILE_SLOTS.items():
|
||||
formats.append({
|
||||
@ -86,10 +119,11 @@ class CrackleIE(InfoExtractor):
|
||||
if locale and v:
|
||||
if locale not in subtitles:
|
||||
subtitles[locale] = []
|
||||
subtitles[locale] = [{
|
||||
'url': '%s/%s%s_%s.xml' % (config_doc.attrib['strSubtitleServer'], path, locale, v),
|
||||
'ext': 'ttml',
|
||||
}]
|
||||
for url_ext, ext in (('vtt', 'vtt'), ('xml', 'tt')):
|
||||
subtitles.setdefault(locale, []).append({
|
||||
'url': '%s/%s%s_%s.%s' % (config_doc.attrib['strSubtitleServer'], path, locale, v, url_ext),
|
||||
'ext': ext,
|
||||
})
|
||||
self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id'))
|
||||
|
||||
return {
|
||||
@ -100,7 +134,7 @@ class CrackleIE(InfoExtractor):
|
||||
'series': item.attrib.get('sn'),
|
||||
'season_number': int_or_none(item.attrib.get('se')),
|
||||
'episode_number': int_or_none(item.attrib.get('ep')),
|
||||
'thumbnail': thumbnail,
|
||||
'thumbnails': thumbnails,
|
||||
'subtitles': subtitles,
|
||||
'formats': formats,
|
||||
}
|
||||
|
@ -14,7 +14,7 @@ class CriterionIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Le Samouraï',
|
||||
'description': 'md5:a2b4b116326558149bef81f76dcbb93f',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -16,7 +16,7 @@ class CrooksAndLiarsIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Fox & Friends Says Protecting Atheists From Discrimination Is Anti-Christian!',
|
||||
'description': 'md5:e1a46ad1650e3a5ec7196d432799127f',
|
||||
'thumbnail': 're:^https?://.*\.jpg',
|
||||
'thumbnail': r're:^https?://.*\.jpg',
|
||||
'timestamp': 1428207000,
|
||||
'upload_date': '20150405',
|
||||
'uploader': 'Heather',
|
||||
|
@ -123,7 +123,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
|
||||
'url': 'http://www.crunchyroll.com/wanna-be-the-strongest-in-the-world/episode-1-an-idol-wrestler-is-born-645513',
|
||||
'info_dict': {
|
||||
'id': '645513',
|
||||
'ext': 'flv',
|
||||
'ext': 'mp4',
|
||||
'title': 'Wanna be the Strongest in the World Episode 1 – An Idol-Wrestler is Born!',
|
||||
'description': 'md5:2d17137920c64f2f49981a7797d275ef',
|
||||
'thumbnail': 'http://img1.ak.crunchyroll.com/i/spire1-tmb/20c6b5e10f1a47b10516877d3c039cae1380951166_full.jpg',
|
||||
@ -142,7 +142,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
|
||||
'ext': 'flv',
|
||||
'title': 'Culture Japan Episode 1 – Rebuilding Japan after the 3.11',
|
||||
'description': 'md5:2fbc01f90b87e8e9137296f37b461c12',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'uploader': 'Danny Choo Network',
|
||||
'upload_date': '20120213',
|
||||
},
|
||||
@ -158,7 +158,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
|
||||
'ext': 'mp4',
|
||||
'title': 'Re:ZERO -Starting Life in Another World- Episode 5 – The Morning of Our Promise Is Still Distant',
|
||||
'description': 'md5:97664de1ab24bbf77a9c01918cb7dca9',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'uploader': 'TV TOKYO',
|
||||
'upload_date': '20160508',
|
||||
},
|
||||
@ -166,6 +166,25 @@ class CrunchyrollIE(CrunchyrollBaseIE):
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.crunchyroll.com/konosuba-gods-blessing-on-this-wonderful-world/episode-1-give-me-deliverance-from-this-judicial-injustice-727589',
|
||||
'info_dict': {
|
||||
'id': '727589',
|
||||
'ext': 'mp4',
|
||||
'title': "KONOSUBA -God's blessing on this wonderful world! 2 Episode 1 – Give Me Deliverance from this Judicial Injustice!",
|
||||
'description': 'md5:cbcf05e528124b0f3a0a419fc805ea7d',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'uploader': 'Kadokawa Pictures Inc.',
|
||||
'upload_date': '20170118',
|
||||
'series': "KONOSUBA -God's blessing on this wonderful world!",
|
||||
'season_number': 2,
|
||||
'episode': 'Give Me Deliverance from this Judicial Injustice!',
|
||||
'episode_number': 1,
|
||||
},
|
||||
'params': {
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.crunchyroll.fr/girl-friend-beta/episode-11-goodbye-la-mode-661697',
|
||||
'only_matching': True,
|
||||
@ -173,6 +192,36 @@ class CrunchyrollIE(CrunchyrollBaseIE):
|
||||
# geo-restricted (US), 18+ maturity wall, non-premium available
|
||||
'url': 'http://www.crunchyroll.com/cosplay-complex-ova/episode-1-the-birth-of-the-cosplay-club-565617',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# A description with double quotes
|
||||
'url': 'http://www.crunchyroll.com/11eyes/episode-1-piros-jszaka-red-night-535080',
|
||||
'info_dict': {
|
||||
'id': '535080',
|
||||
'ext': 'mp4',
|
||||
'title': '11eyes Episode 1 – Piros éjszaka - Red Night',
|
||||
'description': 'Kakeru and Yuka are thrown into an alternate nightmarish world they call "Red Night".',
|
||||
'uploader': 'Marvelous AQL Inc.',
|
||||
'upload_date': '20091021',
|
||||
},
|
||||
'params': {
|
||||
# Just test metadata extraction
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
# make sure we can extract an uploader name that's not a link
|
||||
'url': 'http://www.crunchyroll.com/hakuoki-reimeiroku/episode-1-dawn-of-the-divine-warriors-606899',
|
||||
'info_dict': {
|
||||
'id': '606899',
|
||||
'ext': 'mp4',
|
||||
'title': 'Hakuoki Reimeiroku Episode 1 – Dawn of the Divine Warriors',
|
||||
'description': 'Ryunosuke was left to die, but Serizawa-san asked him a simple question "Do you want to live?"',
|
||||
'uploader': 'Geneon Entertainment',
|
||||
'upload_date': '20120717',
|
||||
},
|
||||
'params': {
|
||||
# just test metadata extraction
|
||||
'skip_download': True,
|
||||
},
|
||||
}]
|
||||
|
||||
_FORMAT_IDS = {
|
||||
@ -236,8 +285,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
|
||||
output += 'WrapStyle: %s\n' % sub_root.attrib['wrap_style']
|
||||
output += 'PlayResX: %s\n' % sub_root.attrib['play_res_x']
|
||||
output += 'PlayResY: %s\n' % sub_root.attrib['play_res_y']
|
||||
output += """ScaledBorderAndShadow: no
|
||||
|
||||
output += """
|
||||
[V4+ Styles]
|
||||
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
|
||||
"""
|
||||
@ -344,9 +392,9 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
|
||||
r'(?s)<h1[^>]*>((?:(?!<h1).)*?<span[^>]+itemprop=["\']title["\'][^>]*>(?:(?!<h1).)+?)</h1>',
|
||||
webpage, 'video_title')
|
||||
video_title = re.sub(r' {2,}', ' ', video_title)
|
||||
video_description = self._html_search_regex(
|
||||
r'<script[^>]*>\s*.+?\[media_id=%s\].+?"description"\s*:\s*"([^"]+)' % video_id,
|
||||
webpage, 'description', default=None)
|
||||
video_description = self._parse_json(self._html_search_regex(
|
||||
r'<script[^>]*>\s*.+?\[media_id=%s\].+?({.+?"description"\s*:.+?})\);' % video_id,
|
||||
webpage, 'description', default='{}'), video_id).get('description')
|
||||
if video_description:
|
||||
video_description = lowercase_escape(video_description.replace(r'\r\n', '\n'))
|
||||
video_upload_date = self._html_search_regex(
|
||||
@ -355,8 +403,9 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
|
||||
if video_upload_date:
|
||||
video_upload_date = unified_strdate(video_upload_date)
|
||||
video_uploader = self._html_search_regex(
|
||||
r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', webpage,
|
||||
'video_uploader', fatal=False)
|
||||
# try looking for both an uploader that's a link and one that's not
|
||||
[r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', r'<div>\s*Publisher:\s*<span>\s*(.+?)\s*</span>\s*</div>'],
|
||||
webpage, 'video_uploader', fatal=False)
|
||||
|
||||
available_fmts = []
|
||||
for a, fmt in re.findall(r'(<a[^>]+token=["\']showmedia\.([0-9]{3,4})p["\'][^>]+>)', webpage):
|
||||
@ -439,6 +488,18 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
|
||||
|
||||
subtitles = self.extract_subtitles(video_id, webpage)
|
||||
|
||||
# webpage provides more accurate data than series_title from XML
|
||||
series = self._html_search_regex(
|
||||
r'id=["\']showmedia_about_episode_num[^>]+>\s*<a[^>]+>([^<]+)',
|
||||
webpage, 'series', default=xpath_text(metadata, 'series_title'))
|
||||
|
||||
episode = xpath_text(metadata, 'episode_title')
|
||||
episode_number = int_or_none(xpath_text(metadata, 'episode_number'))
|
||||
|
||||
season_number = int_or_none(self._search_regex(
|
||||
r'(?s)<h4[^>]+id=["\']showmedia_about_episode_num[^>]+>.+?</h4>\s*<h4>\s*Season (\d+)',
|
||||
webpage, 'season number', default=None))
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': video_title,
|
||||
@ -446,9 +507,10 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
|
||||
'thumbnail': xpath_text(metadata, 'episode_image_url'),
|
||||
'uploader': video_uploader,
|
||||
'upload_date': video_upload_date,
|
||||
'series': xpath_text(metadata, 'series_title'),
|
||||
'episode': xpath_text(metadata, 'episode_title'),
|
||||
'episode_number': int_or_none(xpath_text(metadata, 'episode_number')),
|
||||
'series': series,
|
||||
'season_number': season_number,
|
||||
'episode': episode,
|
||||
'episode_number': episode_number,
|
||||
'subtitles': subtitles,
|
||||
'formats': formats,
|
||||
}
|
||||
@ -488,11 +550,11 @@ class CrunchyrollShowPlaylistIE(CrunchyrollBaseIE):
|
||||
r'(?s)<h1[^>]*>\s*<span itemprop="name">(.*?)</span>',
|
||||
webpage, 'title')
|
||||
episode_paths = re.findall(
|
||||
r'(?s)<li id="showview_videos_media_[0-9]+"[^>]+>.*?<a href="([^"]+)"',
|
||||
r'(?s)<li id="showview_videos_media_(\d+)"[^>]+>.*?<a href="([^"]+)"',
|
||||
webpage)
|
||||
entries = [
|
||||
self.url_result('http://www.crunchyroll.com' + ep, 'Crunchyroll')
|
||||
for ep in episode_paths
|
||||
self.url_result('http://www.crunchyroll.com' + ep, 'Crunchyroll', ep_id)
|
||||
for ep_id, ep in episode_paths
|
||||
]
|
||||
entries.reverse()
|
||||
|
||||
|
@ -12,6 +12,7 @@ from ..utils import (
|
||||
ExtractorError,
|
||||
)
|
||||
from .senateisvp import SenateISVPIE
|
||||
from .ustream import UstreamIE
|
||||
|
||||
|
||||
class CSpanIE(InfoExtractor):
|
||||
@ -22,14 +23,13 @@ class CSpanIE(InfoExtractor):
|
||||
'md5': '94b29a4f131ff03d23471dd6f60b6a1d',
|
||||
'info_dict': {
|
||||
'id': '315139',
|
||||
'ext': 'mp4',
|
||||
'title': 'Attorney General Eric Holder on Voting Rights Act Decision',
|
||||
'description': 'Attorney General Eric Holder speaks to reporters following the Supreme Court decision in [Shelby County v. Holder], in which the court ruled that the preclearance provisions of the Voting Rights Act could not be enforced.',
|
||||
},
|
||||
'playlist_mincount': 2,
|
||||
'skip': 'Regularly fails on travis, for unknown reasons',
|
||||
}, {
|
||||
'url': 'http://www.c-span.org/video/?c4486943/cspan-international-health-care-models',
|
||||
'md5': '8e5fbfabe6ad0f89f3012a7943c1287b',
|
||||
# md5 is unstable
|
||||
'info_dict': {
|
||||
'id': 'c4486943',
|
||||
'ext': 'mp4',
|
||||
@ -38,14 +38,11 @@ class CSpanIE(InfoExtractor):
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.c-span.org/video/?318608-1/gm-ignition-switch-recall',
|
||||
'md5': '2ae5051559169baadba13fc35345ae74',
|
||||
'info_dict': {
|
||||
'id': '342759',
|
||||
'ext': 'mp4',
|
||||
'title': 'General Motors Ignition Switch Recall',
|
||||
'duration': 14848,
|
||||
'description': 'md5:118081aedd24bf1d3b68b3803344e7f3'
|
||||
},
|
||||
'playlist_mincount': 6,
|
||||
}, {
|
||||
# Video from senate.gov
|
||||
'url': 'http://www.c-span.org/video/?104517-1/immigration-reforms-needed-protect-skilled-american-workers',
|
||||
@ -57,12 +54,30 @@ class CSpanIE(InfoExtractor):
|
||||
'params': {
|
||||
'skip_download': True, # m3u8 downloads
|
||||
}
|
||||
}, {
|
||||
# Ustream embedded video
|
||||
'url': 'https://www.c-span.org/video/?114917-1/armed-services',
|
||||
'info_dict': {
|
||||
'id': '58428542',
|
||||
'ext': 'flv',
|
||||
'title': 'USHR07 Armed Services Committee',
|
||||
'description': 'hsas00-2118-20150204-1000et-07\n\n\nUSHR07 Armed Services Committee',
|
||||
'timestamp': 1423060374,
|
||||
'upload_date': '20150204',
|
||||
'uploader': 'HouseCommittee',
|
||||
'uploader_id': '12987475',
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
video_type = None
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
ustream_url = UstreamIE._extract_url(webpage)
|
||||
if ustream_url:
|
||||
return self.url_result(ustream_url, UstreamIE.ie_key())
|
||||
|
||||
# We first look for clipid, because clipprog always appears before
|
||||
patterns = [r'id=\'clip(%s)\'\s*value=\'([0-9]+)\'' % t for t in ('id', 'prog')]
|
||||
results = list(filter(None, (re.search(p, webpage) for p in patterns)))
|
||||
|
@ -28,7 +28,7 @@ class CtsNewsIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': '韓國31歲童顏男 貌如十多歲小孩',
|
||||
'description': '越有年紀的人,越希望看起來年輕一點,而南韓卻有一位31歲的男子,看起來像是11、12歲的小孩,身...',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'timestamp': 1378205880,
|
||||
'upload_date': '20130903',
|
||||
}
|
||||
@ -41,7 +41,7 @@ class CtsNewsIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'iPhone6熱銷 蘋果財報亮眼',
|
||||
'description': 'md5:f395d4f485487bb0f992ed2c4b07aa7d',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'upload_date': '20150128',
|
||||
'uploader_id': 'TBSCTS',
|
||||
'uploader': '中華電視公司',
|
||||
|
@ -21,7 +21,7 @@ class CultureUnpluggedIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'The Next, Best West',
|
||||
'description': 'md5:0423cd00833dea1519cf014e9d0903b1',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'creator': 'Coldstream Creative',
|
||||
'duration': 2203,
|
||||
'view_count': int,
|
||||
|
@ -58,7 +58,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Steam Machine Models, Pricing Listed on Steam Store - IGN News',
|
||||
'description': 'Several come bundled with the Steam Controller.',
|
||||
'thumbnail': 're:^https?:.*\.(?:jpg|png)$',
|
||||
'thumbnail': r're:^https?:.*\.(?:jpg|png)$',
|
||||
'duration': 74,
|
||||
'timestamp': 1425657362,
|
||||
'upload_date': '20150306',
|
||||
@ -66,7 +66,6 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
|
||||
'uploader_id': 'xijv66',
|
||||
'age_limit': 0,
|
||||
'view_count': int,
|
||||
'comment_count': int,
|
||||
}
|
||||
},
|
||||
# Vevo video
|
||||
@ -140,7 +139,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
|
||||
view_count = str_to_int(view_count_str)
|
||||
comment_count = int_or_none(self._search_regex(
|
||||
r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserComments:(\d+)"',
|
||||
webpage, 'comment count', fatal=False))
|
||||
webpage, 'comment count', default=None))
|
||||
|
||||
player_v5 = self._search_regex(
|
||||
[r'buildPlayer\(({.+?})\);\n', # See https://github.com/rg3/youtube-dl/issues/7826
|
||||
@ -283,9 +282,14 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
         }
 
     def _check_error(self, info):
+        error = info.get('error')
         if info.get('error') is not None:
+            title = error['title']
+            # See https://developer.dailymotion.com/api#access-error
+            if error.get('code') == 'DM007':
+                self.raise_geo_restricted(msg=title)
             raise ExtractorError(
-                '%s said: %s' % (self.IE_NAME, info['error']['title']), expected=True)
+                '%s said: %s' % (self.IE_NAME, title), expected=True)
 
     def _get_subtitles(self, video_id, webpage):
         try:
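Note on the `_check_error` change: the new branch can be sketched in isolation as follows. This is an illustration under stated assumptions, not the extractor's code: both exception classes and the `ie_name` parameter are stand-ins introduced here, and only the `DM007` code is taken from the hunk.

```python
class ExtractorError(Exception):
    """Stand-in for the extractor's generic error type."""


class GeoRestrictedError(ExtractorError):
    """Stand-in for a dedicated geo-restriction error."""


def check_error(info, ie_name='dailymotion'):
    """Raise a specific error for geo-blocking, a generic one otherwise."""
    error = info.get('error')
    if error is not None:
        title = error['title']
        # 'DM007' is the code the hunk treats as a geo-restriction.
        if error.get('code') == 'DM007':
            raise GeoRestrictedError(title)
        raise ExtractorError('%s said: %s' % (ie_name, title))
```

Checking the specific code before the generic raise lets callers distinguish a geo-block from any other API error while keeping the old message format for everything else.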
159
youtube_dl/extractor/daisuki.py
Normal file
@ -0,0 +1,159 @@
+from __future__ import unicode_literals
+
+import base64
+import json
+import random
+import re
+
+from .common import InfoExtractor
+from ..aes import (
+    aes_cbc_decrypt,
+    aes_cbc_encrypt,
+)
+from ..utils import (
+    bytes_to_intlist,
+    bytes_to_long,
+    clean_html,
+    ExtractorError,
+    intlist_to_bytes,
+    get_element_by_id,
+    js_to_json,
+    int_or_none,
+    long_to_bytes,
+    pkcs1pad,
+    remove_end,
+)
+
+
+class DaisukiIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?daisuki\.net/[^/]+/[^/]+/[^/]+/watch\.[^.]+\.(?P<id>\d+)\.html'
+
+    _TEST = {
+        'url': 'http://www.daisuki.net/tw/en/anime/watch.TheIdolMasterCG.11213.html',
+        'info_dict': {
+            'id': '11213',
+            'ext': 'mp4',
+            'title': '#01 Who is in the pumpkin carriage? - THE IDOLM@STER CINDERELLA GIRLS',
+            'subtitles': {
+                'mul': [{
+                    'ext': 'ttml',
+                }],
+            },
+            'creator': 'BANDAI NAMCO Entertainment',
+        },
+        'params': {
+            'skip_download': True,  # AES-encrypted HLS stream
+        },
+    }
+
+    # The public key in PEM format can be found in clientlibs_anime_watch.min.js
+    _RSA_KEY = (0xc5524c25e8e14b366b3754940beeb6f96cb7e2feef0b932c7659a0c5c3bf173d602464c2df73d693b513ae06ff1be8f367529ab30bf969c5640522181f2a0c51ea546ae120d3d8d908595e4eff765b389cde080a1ef7f1bbfb07411cc568db73b7f521cedf270cbfbe0ddbc29b1ac9d0f2d8f4359098caffee6d07915020077d, 65537)
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        flashvars = self._parse_json(self._search_regex(
+            r'(?s)var\s+flashvars\s*=\s*({.+?});', webpage, 'flashvars'),
+            video_id, transform_source=js_to_json)
+
+        iv = [0] * 16
+
+        data = {}
+        for key in ('device_cd', 'mv_id', 'ss1_prm', 'ss2_prm', 'ss3_prm', 'ss_id'):
+            data[key] = flashvars.get(key, '')
+
+        encrypted_rtn = None
+
+        # Some AES keys are rejected. Try it with different AES keys
+        for idx in range(5):
+            aes_key = [random.randint(0, 254) for _ in range(32)]
+            padded_aeskey = intlist_to_bytes(pkcs1pad(aes_key, 128))
+
+            n, e = self._RSA_KEY
+            encrypted_aeskey = long_to_bytes(pow(bytes_to_long(padded_aeskey), e, n))
+            init_data = self._download_json('http://www.daisuki.net/bin/bgn/init', video_id, query={
+                's': flashvars.get('s', ''),
+                'c': flashvars.get('ss3_prm', ''),
+                'e': url,
+                'd': base64.b64encode(intlist_to_bytes(aes_cbc_encrypt(
+                    bytes_to_intlist(json.dumps(data)),
+                    aes_key, iv))).decode('ascii'),
+                'a': base64.b64encode(encrypted_aeskey).decode('ascii'),
+            }, note='Downloading JSON metadata' + (' (try #%d)' % (idx + 1) if idx > 0 else ''))
+
+            if 'rtn' in init_data:
+                encrypted_rtn = init_data['rtn']
+                break
+
+            self._sleep(5, video_id)
+
+        if encrypted_rtn is None:
+            raise ExtractorError('Failed to fetch init data')
+
+        rtn = self._parse_json(
+            intlist_to_bytes(aes_cbc_decrypt(bytes_to_intlist(
+                base64.b64decode(encrypted_rtn)),
+                aes_key, iv)).decode('utf-8').rstrip('\0'),
+            video_id)
+
+        formats = self._extract_m3u8_formats(
+            rtn['play_url'], video_id, ext='mp4', entry_protocol='m3u8_native')
+
+        title = remove_end(self._og_search_title(webpage), ' - DAISUKI')
+
+        creator = self._html_search_regex(
+            r'Creator\s*:\s*([^<]+)', webpage, 'creator', fatal=False)
+
+        subtitles = {}
+        caption_url = rtn.get('caption_url')
+        if caption_url:
+            # mul: multiple languages
+            subtitles['mul'] = [{
+                'url': caption_url,
+                'ext': 'ttml',
+            }]
+
+        return {
+            'id': video_id,
+            'title': title,
+            'formats': formats,
+            'subtitles': subtitles,
+            'creator': creator,
+        }
+
+
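Note on the key-wrapping step in the retry loop (`pkcs1pad`, then `pow(..., e, n)`): it can be sketched on its own as below. This is a standalone illustration and not youtube-dl's `pkcs1pad`. The helper names are hypothetical, the filler bytes are drawn non-zero as PKCS#1 v1.5 specifies, and the modulus is a placeholder chosen only for its size, not a real key.

```python
import os


def pkcs1_v15_pad(data, length):
    """Build 0x00 0x02 <non-zero random bytes> 0x00 <data>, `length` bytes long."""
    if len(data) > length - 11:
        raise ValueError('data too long for PKCS#1 v1.5 padding')
    filler = bytearray()
    while len(filler) < length - len(data) - 3:
        byte = os.urandom(1)
        if byte != b'\x00':  # the filler must not contain the 0x00 separator
            filler += byte
    return b'\x00\x02' + bytes(filler) + b'\x00' + bytes(data)


def rsa_encrypt(data, n, e):
    """Textbook RSA on a padded block: c = m^e mod n."""
    size = (n.bit_length() + 7) // 8
    padded = pkcs1_v15_pad(data, size)
    cipher = pow(int.from_bytes(padded, 'big'), e, n)
    return cipher.to_bytes(size, 'big')


# Placeholder 1024-bit modulus (NOT a real RSA key), used only to show sizes.
PLACEHOLDER_N = (1 << 1023) + 1
aes_key = os.urandom(32)
wrapped = rsa_encrypt(aes_key, PLACEHOLDER_N, 65537)
```

A 32-byte AES key padded to 128 bytes fits a 1024-bit modulus, which matches the `pkcs1pad(aes_key, 128)` call in the diff.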
+class DaisukiPlaylistIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)daisuki\.net/[^/]+/[^/]+/[^/]+/detail\.(?P<id>[a-zA-Z0-9]+)\.html'
+
+    _TEST = {
+        'url': 'http://www.daisuki.net/tw/en/anime/detail.TheIdolMasterCG.html',
+        'info_dict': {
+            'id': 'TheIdolMasterCG',
+            'title': 'THE IDOLM@STER CINDERELLA GIRLS',
+            'description': 'md5:0f2c028a9339f7a2c7fbf839edc5c5d8',
+        },
+        'playlist_count': 26,
+    }
+
+    def _real_extract(self, url):
+        playlist_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, playlist_id)
+
+        episode_pattern = r'''(?sx)
+            <img[^>]+delay="[^"]+/(\d+)/movie\.jpg".+?
+            <p[^>]+class=".*?\bepisodeNumber\b.*?">(?:<a[^>]+>)?([^<]+)'''
+        entries = [{
+            '_type': 'url_transparent',
+            'url': url.replace('detail', 'watch').replace('.html', '.' + movie_id + '.html'),
+            'episode_id': episode_id,
+            'episode_number': int_or_none(episode_id),
+        } for movie_id, episode_id in re.findall(episode_pattern, webpage)]
+
+        playlist_title = remove_end(
+            self._og_search_title(webpage, fatal=False), ' - Anime - DAISUKI')
+        playlist_description = clean_html(get_element_by_id('synopsisTxt', webpage))
+
+        return self.playlist_result(entries, playlist_id, playlist_title, playlist_description)
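Note on how the playlist extractor turns a detail page into episode URLs: a minimal sketch using the same pattern and the same two `str.replace` calls as the diff. The function names and the sample markup are hypothetical; the markup is shaped only to satisfy the pattern and is not taken from the site.

```python
import re

EPISODE_PATTERN = r'''(?sx)
    <img[^>]+delay="[^"]+/(\d+)/movie\.jpg".+?
    <p[^>]+class=".*?\bepisodeNumber\b.*?">(?:<a[^>]+>)?([^<]+)'''


def watch_url(detail_url, movie_id):
    """Rewrite detail.<series>.html into watch.<series>.<movie_id>.html."""
    return detail_url.replace('detail', 'watch').replace(
        '.html', '.' + movie_id + '.html')


def episode_entries(detail_url, webpage):
    """Collect one entry per (movie id, episode label) pair found on the page."""
    return [
        {'url': watch_url(detail_url, movie_id), 'episode_id': episode_id}
        for movie_id, episode_id in re.findall(EPISODE_PATTERN, webpage)
    ]


# Hypothetical markup, shaped only to satisfy the pattern above.
SAMPLE_PAGE = (
    '<img src="x.gif" delay="http://img.example/11213/movie.jpg">'
    '<p class="episodeNumber"><a href="#ep">1</a></p>'
)
```

Because `str.replace('detail', 'watch')` rewrites every occurrence, a series identifier that itself contained "detail" would also be altered; the sketch keeps that behaviour unchanged.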
@ -32,7 +32,7 @@ class DaumIE(InfoExtractor):
 'title': '마크 헌트 vs 안토니오 실바',
 'description': 'Mark Hunt vs Antonio Silva',
 'upload_date': '20131217',
-'thumbnail': 're:^https?://.*\.(?:jpg|png)',
+'thumbnail': r're:^https?://.*\.(?:jpg|png)',
 'duration': 2117,
 'view_count': int,
 'comment_count': int,
@ -45,7 +45,7 @@ class DaumIE(InfoExtractor):
 'title': '1297회, \'아빠 아들로 태어나길 잘 했어\' 민수, 감동의 눈물[아빠 어디가] 20150118',
 'description': 'md5:79794514261164ff27e36a21ad229fc5',
 'upload_date': '20150604',
-'thumbnail': 're:^https?://.*\.(?:jpg|png)',
+'thumbnail': r're:^https?://.*\.(?:jpg|png)',
 'duration': 154,
 'view_count': int,
 'comment_count': int,
@ -61,7 +61,7 @@ class DaumIE(InfoExtractor):
 'title': '01-Korean War ( Trouble on the horizon )',
 'description': '\nKorean War 01\nTrouble on the horizon\n전쟁의 먹구름',
 'upload_date': '20080223',
-'thumbnail': 're:^https?://.*\.(?:jpg|png)',
+'thumbnail': r're:^https?://.*\.(?:jpg|png)',
 'duration': 249,
 'view_count': int,
 'comment_count': int,
@ -139,7 +139,7 @@ class DaumClipIE(InfoExtractor):
 'title': 'DOTA 2GETHER 시즌2 6회 - 2부',
 'description': 'DOTA 2GETHER 시즌2 6회 - 2부',
 'upload_date': '20130831',
-'thumbnail': 're:^https?://.*\.(?:jpg|png)',
+'thumbnail': r're:^https?://.*\.(?:jpg|png)',
 'duration': 3868,
 'view_count': int,
 },
@ -17,7 +17,7 @@ class DBTVIE(InfoExtractor):
 'ext': 'mp4',
 'title': 'Skulle teste ut fornøyelsespark, men kollegaen var bare opptatt av bikinikroppen',
 'description': 'md5:1504a54606c4dde3e4e61fc97aa857e0',
-'thumbnail': 're:https?://.*\.jpg',
+'thumbnail': r're:https?://.*\.jpg',
 'timestamp': 1404039863,
 'upload_date': '20140629',
 'duration': 69.544,
@ -17,7 +17,7 @@ class DctpTvIE(InfoExtractor):
 'title': 'Videoinstallation für eine Kaufhausfassade',
 'description': 'Kurzfilm',
 'upload_date': '20110407',
-'thumbnail': 're:^https?://.*\.jpg$',
+'thumbnail': r're:^https?://.*\.jpg$',
 },
 }
 
@ -19,7 +19,7 @@ class DeezerPlaylistIE(InfoExtractor):
 'id': '176747451',
 'title': 'Best!',
 'uploader': 'Anonymous',
-'thumbnail': 're:^https?://cdn-images.deezer.com/images/cover/.*\.jpg$',
+'thumbnail': r're:^https?://cdn-images.deezer.com/images/cover/.*\.jpg$',
 },
 'playlist_count': 30,
 'skip': 'Only available in .de',
@ -17,7 +17,7 @@ class DHMIE(InfoExtractor):
 'title': 'MARSHALL PLAN AT WORK IN WESTERN GERMANY, THE',
 'description': 'md5:1fabd480c153f97b07add61c44407c82',
 'duration': 660,
-'thumbnail': 're:^https?://.*\.jpg$',
+'thumbnail': r're:^https?://.*\.jpg$',
 },
 }, {
 'url': 'http://www.dhm.de/filmarchiv/02-mapping-the-wall/peter-g/rolle-1/',
@ -26,7 +26,7 @@ class DHMIE(InfoExtractor):
 'id': 'rolle-1',
 'ext': 'flv',
 'title': 'ROLLE 1',
-'thumbnail': 're:^https?://.*\.jpg$',
+'thumbnail': r're:^https?://.*\.jpg$',
 },
 }]
 
@ -36,7 +36,7 @@ class DigitekaIE(InfoExtractor):
 'id': 's8uk0r',
 'ext': 'mp4',
 'title': 'Loi sur la fin de vie: le texte prévoit un renforcement des directives anticipées',
-'thumbnail': 're:^https?://.*\.jpg',
+'thumbnail': r're:^https?://.*\.jpg',
 'duration': 74,
 'upload_date': '20150317',
 'timestamp': 1426604939,
@ -50,7 +50,7 @@ class DigitekaIE(InfoExtractor):
 'id': 'xvpfp8',
 'ext': 'mp4',
 'title': 'Two - C\'est La Vie (clip)',
-'thumbnail': 're:^https?://.*\.jpg',
+'thumbnail': r're:^https?://.*\.jpg',
 'duration': 233,
 'upload_date': '20150224',
 'timestamp': 1424760500,
@ -6,7 +6,6 @@ from ..utils import (
 extract_attributes,
 int_or_none,
 parse_age_limit,
-unescapeHTML,
 ExtractorError,
 )
 
@ -49,7 +48,7 @@ class DiscoveryGoIE(InfoExtractor):
 webpage, 'video container'))
 
 video = self._parse_json(
-unescapeHTML(container.get('data-video') or container.get('data-json')),
+container.get('data-video') or container.get('data-json'),
 display_id)
 
 title = video['name']
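Note on dropping `unescapeHTML` here: one plausible reading is that attribute values are already entity-decoded by the time they are read from the container, so a second unescape pass is redundant. The sketch below demonstrates that behaviour with Python 3's standard `html.parser`, not with youtube-dl's `extract_attributes`; the class and function names and the sample element are hypothetical.

```python
import json
from html.parser import HTMLParser


class _AttrCollector(HTMLParser):
    """Remember the attributes of the last start tag seen."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        # The parser hands over attribute values with entities already decoded.
        self.attrs = dict(attrs)


def element_attributes(html_element):
    parser = _AttrCollector()
    parser.feed(html_element)
    parser.close()
    return parser.attrs


container = element_attributes(
    '<div class="video-player-container" '
    'data-video="{&quot;name&quot;: &quot;Example&quot;}">')
video = json.loads(container['data-video'])
```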
Some files were not shown because too many files have changed in this diff