1
0
mirror of https://github.com/l1ving/youtube-dl synced 2025-01-24 17:56:07 +08:00

Merge pull request #1 from rg3/master

Update from original
This commit is contained in:
rogerchang23424 2017-09-14 18:21:28 +08:00 committed by GitHub
commit 61c5f04993
27 changed files with 611 additions and 233 deletions

View File

@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.09.02*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.09.02**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.09.11*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.09.11**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@ -35,7 +35,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2017.09.02
[debug] youtube-dl version 2017.09.11
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@ -82,6 +82,8 @@ To run the test, simply invoke your favorite test runner, or execute a test file
python test/test_download.py
nosetests
See item 6 of [new extractor tutorial](#adding-support-for-a-new-site) for how to run extractor specific test cases.
If you want to create a build of youtube-dl yourself, you'll need
* python
@ -149,7 +151,7 @@ After you have ensured this site is distributing its content legally, you can fo
}
```
5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252). Add tests and code for as many as you want.
8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](https://pypi.python.org/pypi/flake8). Also make sure your code works under all [Python](https://www.python.org/) versions claimed supported by youtube-dl, namely 2.6, 2.7, and 3.2+.
9. When the tests pass, [add](https://git-scm.com/docs/git-add) the new files and [commit](https://git-scm.com/docs/git-commit) them and [push](https://git-scm.com/docs/git-push) the result, like this:

View File

@ -1,3 +1,38 @@
version 2017.09.11
Extractors
* [rutube:playlist] Fix suitable (#14166)
version 2017.09.10
Core
+ [utils] Introduce bool_or_none
* [YoutubeDL] Ensure dir existence for each requested format (#14116)
Extractors
* [fox] Fix extraction (#14147)
* [rutube] Use bool_or_none
* [rutube] Rework and generalize playlist extractors (#13565)
+ [rutube:playlist] Add support for playlists (#13534, #13565)
+ [radiocanada] Add fallback for title extraction (#14145)
* [vk] Use dedicated YouTube embeds extraction routine
* [vice] Use dedicated YouTube embeds extraction routine
* [cracked] Use dedicated YouTube embeds extraction routine
* [chilloutzone] Use dedicated YouTube embeds extraction routine
* [abcnews] Use dedicated YouTube embeds extraction routine
* [youtube] Separate methods for embeds extraction
* [redtube] Fix formats extraction (#14122)
* [arte] Relax unavailability check (#14112)
+ [manyvids] Add support for preview videos from manyvids.com (#14053, #14059)
* [vidme:user] Relax URL regular expression (#14054)
* [bpb] Fix extraction (#14043, #14086)
* [soundcloud] Fix download URL with private tracks (#14093)
* [aliexpress:live] Add support for live.aliexpress.com (#13698, #13707)
* [viidea] Capture and output lecture error message (#14099)
* [radiocanada] Skip unsupported platforms (#14100)
version 2017.09.02
Extractors

View File

@ -936,6 +936,8 @@ To run the test, simply invoke your favorite test runner, or execute a test file
python test/test_download.py
nosetests
See item 6 of [new extractor tutorial](#adding-support-for-a-new-site) for how to run extractor specific test cases.
If you want to create a build of youtube-dl yourself, you'll need
* python
@ -1003,7 +1005,7 @@ After you have ensured this site is distributing its content legally, you can fo
}
```
5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L74-L252). Add tests and code for as many as you want.
8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](https://pypi.python.org/pypi/flake8). Also make sure your code works under all [Python](https://www.python.org/) versions claimed supported by youtube-dl, namely 2.6, 2.7, and 3.2+.
9. When the tests pass, [add](https://git-scm.com/docs/git-add) the new files and [commit](https://git-scm.com/docs/git-commit) them and [push](https://git-scm.com/docs/git-push) the result, like this:

View File

@ -38,6 +38,7 @@
- **afreecatv**: afreecatv.com
- **afreecatv:global**: afreecatv.com
- **AirMozilla**
- **AliExpressLive**
- **AlJazeera**
- **Allocine**
- **AlphaPorno**
@ -437,6 +438,7 @@
- **MakerTV**
- **mangomolo:live**
- **mangomolo:video**
- **ManyVids**
- **MatchTV**
- **MDR**: MDR.DE and KiKA
- **media.ccc.de**
@ -701,6 +703,7 @@
- **rutube:embed**: Rutube embedded videos
- **rutube:movie**: Rutube movies
- **rutube:person**: Rutube person videos
- **rutube:playlist**: Rutube playlists
- **RUTV**: RUTV.RU
- **Ruutu**
- **Ruv**

View File

@ -1710,12 +1710,17 @@ class YoutubeDL(object):
if filename is None:
return
try:
dn = os.path.dirname(sanitize_path(encodeFilename(filename)))
if dn and not os.path.exists(dn):
os.makedirs(dn)
except (OSError, IOError) as err:
self.report_error('unable to create directory ' + error_to_compat_str(err))
def ensure_dir_exists(path):
try:
dn = os.path.dirname(path)
if dn and not os.path.exists(dn):
os.makedirs(dn)
return True
except (OSError, IOError) as err:
self.report_error('unable to create directory ' + error_to_compat_str(err))
return False
if not ensure_dir_exists(sanitize_path(encodeFilename(filename))):
return
if self.params.get('writedescription', False):
@ -1758,29 +1763,30 @@ class YoutubeDL(object):
ie = self.get_info_extractor(info_dict['extractor_key'])
for sub_lang, sub_info in subtitles.items():
sub_format = sub_info['ext']
if sub_info.get('data') is not None:
sub_data = sub_info['data']
sub_filename = subtitles_filename(filename, sub_lang, sub_format)
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(sub_filename)):
self.to_screen('[info] Video subtitle %s.%s is already present' % (sub_lang, sub_format))
else:
try:
sub_data = ie._download_webpage(
sub_info['url'], info_dict['id'], note=False)
except ExtractorError as err:
self.report_warning('Unable to download subtitle for "%s": %s' %
(sub_lang, error_to_compat_str(err.cause)))
continue
try:
sub_filename = subtitles_filename(filename, sub_lang, sub_format)
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(sub_filename)):
self.to_screen('[info] Video subtitle %s.%s is already_present' % (sub_lang, sub_format))
self.to_screen('[info] Writing video subtitles to: ' + sub_filename)
if sub_info.get('data') is not None:
try:
# Use newline='' to prevent conversion of newline characters
# See https://github.com/rg3/youtube-dl/issues/10268
with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8', newline='') as subfile:
subfile.write(sub_info['data'])
except (OSError, IOError):
self.report_error('Cannot write subtitles file ' + sub_filename)
return
else:
self.to_screen('[info] Writing video subtitles to: ' + sub_filename)
# Use newline='' to prevent conversion of newline characters
# See https://github.com/rg3/youtube-dl/issues/10268
with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8', newline='') as subfile:
subfile.write(sub_data)
except (OSError, IOError):
self.report_error('Cannot write subtitles file ' + sub_filename)
return
try:
sub_data = ie._request_webpage(
sub_info['url'], info_dict['id'], note=False).read()
with io.open(encodeFilename(sub_filename), 'wb') as subfile:
subfile.write(sub_data)
except (ExtractorError, IOError, OSError, ValueError) as err:
self.report_warning('Unable to download subtitle for "%s": %s' %
(sub_lang, error_to_compat_str(err)))
continue
if self.params.get('writeinfojson', False):
infofn = replace_extension(filename, 'info.json', info_dict.get('ext'))
@ -1853,8 +1859,11 @@ class YoutubeDL(object):
for f in requested_formats:
new_info = dict(info_dict)
new_info.update(f)
fname = self.prepare_filename(new_info)
fname = prepend_extension(fname, 'f%s' % f['format_id'], new_info['ext'])
fname = prepend_extension(
self.prepare_filename(new_info),
'f%s' % f['format_id'], new_info['ext'])
if not ensure_dir_exists(fname):
return
downloaded.append(fname)
partial_success = dl(fname, new_info)
success = success and partial_success

View File

@ -7,6 +7,7 @@ import time
from .amp import AMPIE
from .common import InfoExtractor
from .youtube import YoutubeIE
from ..compat import compat_urlparse
@ -108,9 +109,7 @@ class AbcNewsIE(InfoExtractor):
r'window\.abcnvideo\.url\s*=\s*"([^"]+)"', webpage, 'video URL')
full_video_url = compat_urlparse.urljoin(url, video_url)
youtube_url = self._html_search_regex(
r'<iframe[^>]+src="(https://www\.youtube\.com/embed/[^"]+)"',
webpage, 'YouTube URL', default=None)
youtube_url = YoutubeIE._extract_url(webpage)
timestamp = None
date_str = self._html_search_regex(
@ -140,7 +139,7 @@ class AbcNewsIE(InfoExtractor):
}
if youtube_url:
entries = [entry, self.url_result(youtube_url, 'Youtube')]
entries = [entry, self.url_result(youtube_url, ie=YoutubeIE.ie_key())]
return self.playlist_result(entries)
return entry

View File

@ -0,0 +1,53 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
float_or_none,
try_get,
)
class AliExpressLiveIE(InfoExtractor):
_VALID_URL = r'https?://live\.aliexpress\.com/live/(?P<id>\d+)'
_TEST = {
'url': 'https://live.aliexpress.com/live/2800002704436634',
'md5': 'e729e25d47c5e557f2630eaf99b740a5',
'info_dict': {
'id': '2800002704436634',
'ext': 'mp4',
'title': 'CASIMA7.22',
'thumbnail': r're:http://.*\.jpg',
'uploader': 'CASIMA Official Store',
'timestamp': 1500717600,
'upload_date': '20170722',
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
data = self._parse_json(
self._search_regex(
r'(?s)runParams\s*=\s*({.+?})\s*;?\s*var',
webpage, 'runParams'),
video_id)
title = data['title']
formats = self._extract_m3u8_formats(
data['replyStreamUrl'], video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls')
return {
'id': video_id,
'title': title,
'thumbnail': data.get('coverUrl'),
'uploader': try_get(
data, lambda x: x['followBar']['name'], compat_str),
'timestamp': float_or_none(data.get('startTimeLong'), scale=1000),
'formats': formats,
}

View File

@ -3,16 +3,13 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_urlparse,
compat_str,
)
from ..compat import compat_str
from ..utils import (
determine_ext,
extract_attributes,
ExtractorError,
sanitized_Request,
urlencode_postdata,
urljoin,
)
@ -21,6 +18,8 @@ class AnimeOnDemandIE(InfoExtractor):
_LOGIN_URL = 'https://www.anime-on-demand.de/users/sign_in'
_APPLY_HTML5_URL = 'https://www.anime-on-demand.de/html5apply'
_NETRC_MACHINE = 'animeondemand'
# German-speaking countries of Europe
_GEO_COUNTRIES = ['AT', 'CH', 'DE', 'LI', 'LU']
_TESTS = [{
# jap, OmU
'url': 'https://www.anime-on-demand.de/anime/161',
@ -46,6 +45,10 @@ class AnimeOnDemandIE(InfoExtractor):
# Full length film, non-series, ger/jap, Dub/OmU, account required
'url': 'https://www.anime-on-demand.de/anime/185',
'only_matching': True,
}, {
# Flash videos
'url': 'https://www.anime-on-demand.de/anime/12',
'only_matching': True,
}]
def _login(self):
@ -72,14 +75,13 @@ class AnimeOnDemandIE(InfoExtractor):
'post url', default=self._LOGIN_URL, group='url')
if not post_url.startswith('http'):
post_url = compat_urlparse.urljoin(self._LOGIN_URL, post_url)
request = sanitized_Request(
post_url, urlencode_postdata(login_form))
request.add_header('Referer', self._LOGIN_URL)
post_url = urljoin(self._LOGIN_URL, post_url)
response = self._download_webpage(
request, None, 'Logging in as %s' % username)
post_url, None, 'Logging in as %s' % username,
data=urlencode_postdata(login_form), headers={
'Referer': self._LOGIN_URL,
})
if all(p not in response for p in ('>Logout<', 'href="/users/sign_out"')):
error = self._search_regex(
@ -120,10 +122,11 @@ class AnimeOnDemandIE(InfoExtractor):
formats = []
for input_ in re.findall(
r'<input[^>]+class=["\'].*?streamstarter_html5[^>]+>', html):
r'<input[^>]+class=["\'].*?streamstarter[^>]+>', html):
attributes = extract_attributes(input_)
title = attributes.get('data-dialog-header')
playlist_urls = []
for playlist_key in ('data-playlist', 'data-otherplaylist'):
for playlist_key in ('data-playlist', 'data-otherplaylist', 'data-stream'):
playlist_url = attributes.get(playlist_key)
if isinstance(playlist_url, compat_str) and re.match(
r'/?[\da-zA-Z]+', playlist_url):
@ -147,19 +150,38 @@ class AnimeOnDemandIE(InfoExtractor):
format_id_list.append(compat_str(num))
format_id = '-'.join(format_id_list)
format_note = ', '.join(filter(None, (kind, lang_note)))
request = sanitized_Request(
compat_urlparse.urljoin(url, playlist_url),
item_id_list = []
if format_id:
item_id_list.append(format_id)
item_id_list.append('videomaterial')
playlist = self._download_json(
urljoin(url, playlist_url), video_id,
'Downloading %s JSON' % ' '.join(item_id_list),
headers={
'X-Requested-With': 'XMLHttpRequest',
'X-CSRF-Token': csrf_token,
'Referer': url,
'Accept': 'application/json, text/javascript, */*; q=0.01',
})
playlist = self._download_json(
request, video_id, 'Downloading %s playlist JSON' % format_id,
fatal=False)
}, fatal=False)
if not playlist:
continue
stream_url = playlist.get('streamurl')
if stream_url:
rtmp = re.search(
r'^(?P<url>rtmpe?://(?P<host>[^/]+)/(?P<app>.+/))(?P<playpath>mp[34]:.+)',
stream_url)
if rtmp:
formats.append({
'url': rtmp.group('url'),
'app': rtmp.group('app'),
'play_path': rtmp.group('playpath'),
'page_url': url,
'player_url': 'https://www.anime-on-demand.de/assets/jwplayer.flash-55abfb34080700304d49125ce9ffb4a6.swf',
'rtmp_real_time': True,
'format_id': 'rtmp',
'ext': 'flv',
})
continue
start_video = playlist.get('startvideo', 0)
playlist = playlist.get('playlist')
if not playlist or not isinstance(playlist, list):
@ -222,7 +244,7 @@ class AnimeOnDemandIE(InfoExtractor):
f.update({
'id': '%s-%s' % (f['id'], m.group('kind').lower()),
'title': m.group('title'),
'url': compat_urlparse.urljoin(url, m.group('href')),
'url': urljoin(url, m.group('href')),
})
entries.append(f)

View File

@ -82,7 +82,7 @@ class ArteTVBaseIE(InfoExtractor):
vsr = player_info['VSR']
if not vsr and not player_info.get('VRU'):
if not vsr:
raise ExtractorError(
'Video %s is not available' % player_info.get('VID') or video_id,
expected=True)

View File

@ -33,13 +33,18 @@ class BpbIE(InfoExtractor):
title = self._html_search_regex(
r'<h2 class="white">(.*?)</h2>', webpage, 'title')
video_info_dicts = re.findall(
r"({\s*src:\s*'http://film\.bpb\.de/[^}]+})", webpage)
r"({\s*src\s*:\s*'https?://film\.bpb\.de/[^}]+})", webpage)
formats = []
for video_info in video_info_dicts:
video_info = self._parse_json(video_info, video_id, transform_source=js_to_json)
quality = video_info['quality']
video_url = video_info['src']
video_info = self._parse_json(
video_info, video_id, transform_source=js_to_json, fatal=False)
if not video_info:
continue
video_url = video_info.get('src')
if not video_url:
continue
quality = 'high' if '_high' in video_url else 'low'
formats.append({
'url': video_url,
'preference': 10 if quality == 'high' else 0,

View File

@ -5,6 +5,7 @@ import base64
import json
from .common import InfoExtractor
from .youtube import YoutubeIE
from ..utils import (
clean_html,
ExtractorError
@ -70,11 +71,9 @@ class ChilloutzoneIE(InfoExtractor):
# If nativePlatform is None a fallback mechanism is used (i.e. youtube embed)
if native_platform is None:
youtube_url = self._html_search_regex(
r'<iframe.* src="((?:https?:)?//(?:[^.]+\.)?youtube\.com/.+?)"',
webpage, 'fallback video URL', default=None)
if youtube_url is not None:
return self.url_result(youtube_url, ie='Youtube')
youtube_url = YoutubeIE._extract_url(webpage)
if youtube_url:
return self.url_result(youtube_url, ie=YoutubeIE.ie_key())
# Non Fallback: Decide to use native source (e.g. youtube or vimeo) or
# the own CDN

View File

@ -3,6 +3,7 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .youtube import YoutubeIE
from ..utils import (
parse_iso8601,
str_to_int,
@ -41,11 +42,9 @@ class CrackedIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
youtube_url = self._search_regex(
r'<iframe[^>]+src="((?:https?:)?//www\.youtube\.com/embed/[^"]+)"',
webpage, 'youtube url', default=None)
youtube_url = YoutubeIE._extract_url(webpage)
if youtube_url:
return self.url_result(youtube_url, 'Youtube')
return self.url_result(youtube_url, ie=YoutubeIE.ie_key())
video_url = self._html_search_regex(
[r'var\s+CK_vidSrc\s*=\s*"([^"]+)"', r'<video\s+src="([^"]+)"'],

View File

@ -45,6 +45,7 @@ from .anvato import AnvatoIE
from .anysex import AnySexIE
from .aol import AolIE
from .allocine import AllocineIE
from .aliexpress import AliExpressLiveIE
from .aparat import AparatIE
from .appleconnect import AppleConnectIE
from .appletrailers import (
@ -563,6 +564,7 @@ from .mangomolo import (
MangomoloVideoIE,
MangomoloLiveIE,
)
from .manyvids import ManyVidsIE
from .matchtv import MatchTVIE
from .mdr import MDRIE
from .mediaset import MediasetIE
@ -897,6 +899,7 @@ from .rutube import (
RutubeEmbedIE,
RutubeMovieIE,
RutubePersonIE,
RutubePlaylistIE,
)
from .rutv import RUTVIE
from .ruutu import RuutuIE

View File

@ -3,56 +3,99 @@ from __future__ import unicode_literals
from .adobepass import AdobePassIE
from ..utils import (
smuggle_url,
update_url_query,
int_or_none,
parse_age_limit,
parse_duration,
try_get,
unified_timestamp,
)
class FOXIE(AdobePassIE):
_VALID_URL = r'https?://(?:www\.)?fox\.com/watch/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://www.fox.com/watch/255180355939/7684182528',
_VALID_URL = r'https?://(?:www\.)?fox\.com/watch/(?P<id>[\da-fA-F]+)'
_TESTS = [{
# clip
'url': 'https://www.fox.com/watch/4b765a60490325103ea69888fb2bd4e8/',
'md5': 'ebd296fcc41dd4b19f8115d8461a3165',
'info_dict': {
'id': '255180355939',
'id': '4b765a60490325103ea69888fb2bd4e8',
'ext': 'mp4',
'title': 'Official Trailer: Gotham',
'description': 'Tracing the rise of the great DC Comics Super-Villains and vigilantes, Gotham reveals an entirely new chapter that has never been told.',
'duration': 129,
'timestamp': 1400020798,
'upload_date': '20140513',
'uploader': 'NEWA-FNG-FOXCOM',
'title': 'Aftermath: Bruce Wayne Develops Into The Dark Knight',
'description': 'md5:549cd9c70d413adb32ce2a779b53b486',
'duration': 102,
'timestamp': 1504291893,
'upload_date': '20170901',
'creator': 'FOX',
'series': 'Gotham',
},
'add_ie': ['ThePlatform'],
}
'params': {
'skip_download': True,
},
}, {
# episode, geo-restricted
'url': 'https://www.fox.com/watch/087036ca7f33c8eb79b08152b4dd75c1/',
'only_matching': True,
}, {
# episode, geo-restricted, tv provided required
'url': 'https://www.fox.com/watch/30056b295fb57f7452aeeb4920bc3024/',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
settings = self._parse_json(self._search_regex(
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
webpage, 'drupal settings'), video_id)
fox_pdk_player = settings['fox_pdk_player']
release_url = fox_pdk_player['release_url']
query = {
'mbr': 'true',
'switch': 'http'
}
if fox_pdk_player.get('access') == 'locked':
ap_p = settings['foxAdobePassProvider']
rating = ap_p.get('videoRating')
if rating == 'n/a':
rating = None
resource = self._get_mvpd_resource('fbc-fox', None, ap_p['videoGUID'], rating)
query['auth'] = self._extract_mvpd_auth(url, video_id, 'fbc-fox', resource)
video = self._download_json(
'https://api.fox.com/fbc-content/v1_4/video/%s' % video_id,
video_id, headers={
'apikey': 'abdcbed02c124d393b39e818a4312055',
'Content-Type': 'application/json',
'Referer': url,
})
info = self._search_json_ld(webpage, video_id, fatal=False)
info.update({
'_type': 'url_transparent',
'ie_key': 'ThePlatform',
'url': smuggle_url(update_url_query(release_url, query), {'force_smil_url': True}),
title = video['name']
m3u8_url = self._download_json(
video['videoRelease']['url'], video_id)['playURL']
formats = self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls')
self._sort_formats(formats)
description = video.get('description')
duration = int_or_none(video.get('durationInSeconds')) or int_or_none(
video.get('duration')) or parse_duration(video.get('duration'))
timestamp = unified_timestamp(video.get('datePublished'))
age_limit = parse_age_limit(video.get('contentRating'))
data = try_get(
video, lambda x: x['trackingData']['properties'], dict) or {}
creator = data.get('brand') or data.get('network') or video.get('network')
series = video.get('seriesName') or data.get(
'seriesName') or data.get('show')
season_number = int_or_none(video.get('seasonNumber'))
episode = video.get('name')
episode_number = int_or_none(video.get('episodeNumber'))
release_year = int_or_none(video.get('releaseYear'))
if data.get('authRequired'):
# TODO: AP
pass
return {
'id': video_id,
})
return info
'title': title,
'description': description,
'duration': duration,
'timestamp': timestamp,
'age_limit': age_limit,
'creator': creator,
'series': series,
'season_number': season_number,
'episode': episode,
'episode_number': episode_number,
'release_year': release_year,
'formats': formats,
}

View File

@ -2243,36 +2243,11 @@ class GenericIE(InfoExtractor):
if vid_me_embed_url is not None:
return self.url_result(vid_me_embed_url, 'Vidme')
# Look for embedded YouTube player
matches = re.findall(r'''(?x)
(?:
<iframe[^>]+?src=|
data-video-url=|
<embed[^>]+?src=|
embedSWF\(?:\s*|
<object[^>]+data=|
new\s+SWFObject\(
)
(["\'])
(?P<url>(?:https?:)?//(?:www\.)?youtube(?:-nocookie)?\.com/
(?:embed|v|p)/.+?)
\1''', webpage)
if matches:
# Look for YouTube embeds
youtube_urls = YoutubeIE._extract_urls(webpage)
if youtube_urls:
return self.playlist_from_matches(
matches, video_id, video_title, lambda m: unescapeHTML(m[1]))
# Look for lazyYT YouTube embed
matches = re.findall(
r'class="lazyYT" data-youtube-id="([^"]+)"', webpage)
if matches:
return self.playlist_from_matches(matches, video_id, video_title, lambda m: unescapeHTML(m))
# Look for Wordpress "YouTube Video Importer" plugin
matches = re.findall(r'''(?x)<div[^>]+
class=(?P<q1>[\'"])[^\'"]*\byvii_single_video_player\b[^\'"]*(?P=q1)[^>]+
data-video_id=(?P<q2>[\'"])([^\'"]+)(?P=q2)''', webpage)
if matches:
return self.playlist_from_matches(matches, video_id, video_title, lambda m: m[-1])
youtube_urls, video_id, video_title, ie=YoutubeIE.ie_key())
matches = DailymotionIE._extract_urls(webpage)
if matches:

View File

@ -0,0 +1,48 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import int_or_none
class ManyVidsIE(InfoExtractor):
_VALID_URL = r'(?i)https?://(?:www\.)?manyvids\.com/video/(?P<id>\d+)'
_TEST = {
'url': 'https://www.manyvids.com/Video/133957/everthing-about-me/',
'md5': '03f11bb21c52dd12a05be21a5c7dcc97',
'info_dict': {
'id': '133957',
'ext': 'mp4',
'title': 'everthing about me (Preview)',
'view_count': int,
'like_count': int,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(
r'data-(?:video-filepath|meta-video)\s*=s*(["\'])(?P<url>(?:(?!\1).)+)\1',
webpage, 'video URL', group='url')
title = '%s (Preview)' % self._html_search_regex(
r'<h2[^>]+class="m-a-0"[^>]*>([^<]+)', webpage, 'title')
like_count = int_or_none(self._search_regex(
r'data-likes=["\'](\d+)', webpage, 'like count', default=None))
view_count = int_or_none(self._html_search_regex(
r'(?s)<span[^>]+class="views-wrapper"[^>]*>(.+?)</span', webpage,
'view count', default=None))
return {
'id': video_id,
'title': title,
'view_count': view_count,
'like_count': like_count,
'formats': [{
'url': video_url,
}],
}

View File

@ -20,20 +20,37 @@ from ..utils import (
class RadioCanadaIE(InfoExtractor):
IE_NAME = 'radiocanada'
_VALID_URL = r'(?:radiocanada:|https?://ici\.radio-canada\.ca/widgets/mediaconsole/)(?P<app_code>[^:/]+)[:/](?P<id>[0-9]+)'
_TEST = {
'url': 'http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7184272',
'info_dict': {
'id': '7184272',
'ext': 'mp4',
'title': 'Le parcours du tireur capté sur vidéo',
'description': 'Images des caméras de surveillance fournies par la GRC montrant le parcours du tireur d\'Ottawa',
'upload_date': '20141023',
_TESTS = [
{
'url': 'http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7184272',
'info_dict': {
'id': '7184272',
'ext': 'mp4',
'title': 'Le parcours du tireur capté sur vidéo',
'description': 'Images des caméras de surveillance fournies par la GRC montrant le parcours du tireur d\'Ottawa',
'upload_date': '20141023',
},
'params': {
# m3u8 download
'skip_download': True,
}
},
'params': {
# m3u8 download
'skip_download': True,
},
}
{
# empty Title
'url': 'http://ici.radio-canada.ca/widgets/mediaconsole/medianet/7754998/',
'info_dict': {
'id': '7754998',
'ext': 'mp4',
'title': 'letelejournal22h',
'description': 'INTEGRALE WEB 22H-TJ',
'upload_date': '20170720',
},
'params': {
# m3u8 download
'skip_download': True,
},
}
]
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})
@ -145,7 +162,7 @@ class RadioCanadaIE(InfoExtractor):
return {
'id': video_id,
'title': get_meta('Title'),
'title': get_meta('Title') or get_meta('AV-nomEmission'),
'description': get_meta('Description') or get_meta('ShortDescription'),
'thumbnail': get_meta('imageHR') or get_meta('imageMR') or get_meta('imageBR'),
'duration': int_or_none(get_meta('length')),

View File

@ -3,6 +3,7 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
@ -62,7 +63,23 @@ class RedTubeIE(InfoExtractor):
'format_id': format_id,
'height': int_or_none(format_id),
})
else:
medias = self._parse_json(
self._search_regex(
r'mediaDefinition\s*:\s*(\[.+?\])', webpage,
'media definitions', default='{}'),
video_id, fatal=False)
if medias and isinstance(medias, list):
for media in medias:
format_url = media.get('videoUrl')
if not format_url or not isinstance(format_url, compat_str):
continue
format_id = media.get('quality')
formats.append({
'url': format_url,
'format_id': format_id,
'height': int_or_none(format_id),
})
if not formats:
video_url = self._html_search_regex(
r'<source src="(.+?)" type="video/mp4">', webpage, 'video URL')
formats.append({'url': video_url})
@ -73,7 +90,7 @@ class RedTubeIE(InfoExtractor):
r'<span[^>]+class="added-time"[^>]*>ADDED ([^<]+)<',
webpage, 'upload date', fatal=False))
duration = int_or_none(self._search_regex(
r'videoDuration\s*:\s*(\d+)', webpage, 'duration', fatal=False))
r'videoDuration\s*:\s*(\d+)', webpage, 'duration', default=None))
view_count = str_to_int(self._search_regex(
r'<span[^>]*>VIEWS</span></td>\s*<td>([\d,.]+)',
webpage, 'view count', fatal=False))

View File

@ -7,43 +7,84 @@ import itertools
from .common import InfoExtractor
from ..compat import (
compat_str,
compat_parse_qs,
compat_urllib_parse_urlparse,
)
from ..utils import (
determine_ext,
unified_strdate,
bool_or_none,
int_or_none,
try_get,
unified_timestamp,
)
class RutubeIE(InfoExtractor):
class RutubeBaseIE(InfoExtractor):
def _extract_video(self, video, video_id=None, require_title=True):
title = video['title'] if require_title else video.get('title')
age_limit = video.get('is_adult')
if age_limit is not None:
age_limit = 18 if age_limit is True else 0
uploader_id = try_get(video, lambda x: x['author']['id'])
category = try_get(video, lambda x: x['category']['name'])
return {
'id': video.get('id') or video_id,
'title': title,
'description': video.get('description'),
'thumbnail': video.get('thumbnail_url'),
'duration': int_or_none(video.get('duration')),
'uploader': try_get(video, lambda x: x['author']['name']),
'uploader_id': compat_str(uploader_id) if uploader_id else None,
'timestamp': unified_timestamp(video.get('created_ts')),
'category': [category] if category else None,
'age_limit': age_limit,
'view_count': int_or_none(video.get('hits')),
'comment_count': int_or_none(video.get('comments_count')),
'is_live': bool_or_none(video.get('is_livestream')),
}
class RutubeIE(RutubeBaseIE):
IE_NAME = 'rutube'
IE_DESC = 'Rutube videos'
_VALID_URL = r'https?://rutube\.ru/(?:video|(?:play/)?embed)/(?P<id>[\da-z]{32})'
_TESTS = [{
'url': 'http://rutube.ru/video/3eac3b4561676c17df9132a9a1e62e3e/',
'md5': '79938ade01294ef7e27574890d0d3769',
'info_dict': {
'id': '3eac3b4561676c17df9132a9a1e62e3e',
'ext': 'mp4',
'ext': 'flv',
'title': 'Раненный кенгуру забежал в аптеку',
'description': 'http://www.ntdtv.ru ',
'duration': 80,
'uploader': 'NTDRussian',
'uploader_id': '29790',
'timestamp': 1381943602,
'upload_date': '20131016',
'age_limit': 0,
},
'params': {
# It requires ffmpeg (m3u8 download)
'skip_download': True,
},
}, {
'url': 'http://rutube.ru/play/embed/a10e53b86e8f349080f718582ce4c661',
'only_matching': True,
}, {
'url': 'http://rutube.ru/embed/a10e53b86e8f349080f718582ce4c661',
'only_matching': True,
}, {
'url': 'http://rutube.ru/video/3eac3b4561676c17df9132a9a1e62e3e/?pl_id=4252',
'only_matching': True,
}, {
'url': 'https://rutube.ru/video/10b3a03fc01d5bbcc632a2f3514e8aab/?pl_type=source',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if RutubePlaylistIE.suitable(url) else super(RutubeIE, cls).suitable(url)
@staticmethod
def _extract_urls(webpage):
return [mobj.group('url') for mobj in re.finditer(
@ -52,12 +93,12 @@ class RutubeIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
video = self._download_json(
'http://rutube.ru/api/video/%s/?format=json' % video_id,
video_id, 'Downloading video JSON')
# Some videos don't have the author field
author = video.get('author') or {}
info = self._extract_video(video, video_id)
options = self._download_json(
'http://rutube.ru/api/play/options/%s/?format=json' % video_id,
@ -79,19 +120,8 @@ class RutubeIE(InfoExtractor):
})
self._sort_formats(formats)
return {
'id': video['id'],
'title': video['title'],
'description': video['description'],
'duration': video['duration'],
'view_count': video['hits'],
'formats': formats,
'thumbnail': video['thumbnail_url'],
'uploader': author.get('name'),
'uploader_id': compat_str(author['id']) if author else None,
'upload_date': unified_strdate(video['created_ts']),
'age_limit': 18 if video['is_adult'] else 0,
}
info['formats'] = formats
return info
class RutubeEmbedIE(InfoExtractor):
@ -103,7 +133,8 @@ class RutubeEmbedIE(InfoExtractor):
'url': 'http://rutube.ru/video/embed/6722881?vk_puid37=&vk_puid38=',
'info_dict': {
'id': 'a10e53b86e8f349080f718582ce4c661',
'ext': 'mp4',
'ext': 'flv',
'timestamp': 1387830582,
'upload_date': '20131223',
'uploader_id': '297833',
'description': 'Видео группы ★http://vk.com/foxkidsreset★ музей Fox Kids и Jetix<br/><br/> восстановлено и сделано в шикоформате subziro89 http://vk.com/subziro89',
@ -111,7 +142,7 @@ class RutubeEmbedIE(InfoExtractor):
'title': 'Мистический городок Эйри в Индиан 5 серия озвучка subziro89',
},
'params': {
'skip_download': 'Requires ffmpeg',
'skip_download': True,
},
}, {
'url': 'http://rutube.ru/play/embed/8083783',
@ -125,10 +156,51 @@ class RutubeEmbedIE(InfoExtractor):
canonical_url = self._html_search_regex(
r'<link\s+rel="canonical"\s+href="([^"]+?)"', webpage,
'Canonical URL')
return self.url_result(canonical_url, 'Rutube')
return self.url_result(canonical_url, RutubeIE.ie_key())
class RutubeChannelIE(InfoExtractor):
class RutubePlaylistBaseIE(RutubeBaseIE):
def _next_page_url(self, page_num, playlist_id, *args, **kwargs):
return self._PAGE_TEMPLATE % (playlist_id, page_num)
def _entries(self, playlist_id, *args, **kwargs):
next_page_url = None
for pagenum in itertools.count(1):
page = self._download_json(
next_page_url or self._next_page_url(
pagenum, playlist_id, *args, **kwargs),
playlist_id, 'Downloading page %s' % pagenum)
results = page.get('results')
if not results or not isinstance(results, list):
break
for result in results:
video_url = result.get('video_url')
if not video_url or not isinstance(video_url, compat_str):
continue
entry = self._extract_video(result, require_title=False)
entry.update({
'_type': 'url',
'url': video_url,
'ie_key': RutubeIE.ie_key(),
})
yield entry
next_page_url = page.get('next')
if not next_page_url or not page.get('has_next'):
break
def _extract_playlist(self, playlist_id, *args, **kwargs):
return self.playlist_result(
self._entries(playlist_id, *args, **kwargs),
playlist_id, kwargs.get('playlist_name'))
def _real_extract(self, url):
return self._extract_playlist(self._match_id(url))
class RutubeChannelIE(RutubePlaylistBaseIE):
IE_NAME = 'rutube:channel'
IE_DESC = 'Rutube channels'
_VALID_URL = r'https?://rutube\.ru/tags/video/(?P<id>\d+)'
@ -142,27 +214,8 @@ class RutubeChannelIE(InfoExtractor):
_PAGE_TEMPLATE = 'http://rutube.ru/api/tags/video/%s/?page=%s&format=json'
def _extract_videos(self, channel_id, channel_title=None):
entries = []
for pagenum in itertools.count(1):
page = self._download_json(
self._PAGE_TEMPLATE % (channel_id, pagenum),
channel_id, 'Downloading page %s' % pagenum)
results = page['results']
if not results:
break
entries.extend(self.url_result(result['video_url'], 'Rutube') for result in results)
if not page['has_next']:
break
return self.playlist_result(entries, channel_id, channel_title)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
channel_id = mobj.group('id')
return self._extract_videos(channel_id)
class RutubeMovieIE(RutubeChannelIE):
class RutubeMovieIE(RutubePlaylistBaseIE):
IE_NAME = 'rutube:movie'
IE_DESC = 'Rutube movies'
_VALID_URL = r'https?://rutube\.ru/metainfo/tv/(?P<id>\d+)'
@ -176,11 +229,11 @@ class RutubeMovieIE(RutubeChannelIE):
movie = self._download_json(
self._MOVIE_TEMPLATE % movie_id, movie_id,
'Downloading movie JSON')
movie_name = movie['name']
return self._extract_videos(movie_id, movie_name)
return self._extract_playlist(
movie_id, playlist_name=movie.get('name'))
class RutubePersonIE(RutubeChannelIE):
class RutubePersonIE(RutubePlaylistBaseIE):
IE_NAME = 'rutube:person'
IE_DESC = 'Rutube person videos'
_VALID_URL = r'https?://rutube\.ru/video/person/(?P<id>\d+)'
@ -193,3 +246,37 @@ class RutubePersonIE(RutubeChannelIE):
}]
_PAGE_TEMPLATE = 'http://rutube.ru/api/video/person/%s/?page=%s&format=json'
class RutubePlaylistIE(RutubePlaylistBaseIE):
IE_NAME = 'rutube:playlist'
IE_DESC = 'Rutube playlists'
_VALID_URL = r'https?://rutube\.ru/(?:video|(?:play/)?embed)/[\da-z]{32}/\?.*?\bpl_id=(?P<id>\d+)'
_TESTS = [{
'url': 'https://rutube.ru/video/cecd58ed7d531fc0f3d795d51cee9026/?pl_id=3097&pl_type=tag',
'info_dict': {
'id': '3097',
},
'playlist_count': 27,
}, {
'url': 'https://rutube.ru/video/10b3a03fc01d5bbcc632a2f3514e8aab/?pl_id=4252&pl_type=source',
'only_matching': True,
}]
_PAGE_TEMPLATE = 'http://rutube.ru/api/playlist/%s/%s/?page=%s&format=json'
@classmethod
def suitable(cls, url):
if not super(RutubePlaylistIE, cls).suitable(url):
return False
params = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
return params.get('pl_type', [None])[0] and int_or_none(params.get('pl_id', [None])[0])
def _next_page_url(self, page_num, playlist_id, item_kind):
return self._PAGE_TEMPLATE % (item_kind, playlist_id, page_num)
def _real_extract(self, url):
qs = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
playlist_kind = qs['pl_type'][0]
playlist_id = qs['pl_id'][0]
return self._extract_playlist(playlist_id, item_kind=playlist_kind)

View File

@ -1,8 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals
import re
import itertools
import re
from .common import (
InfoExtractor,
@ -17,6 +17,7 @@ from ..utils import (
ExtractorError,
int_or_none,
unified_strdate,
update_url_query,
)
@ -120,6 +121,21 @@ class SoundcloudIE(InfoExtractor):
'license': 'cc-by-sa',
},
},
# private link, downloadable format
{
'url': 'https://soundcloud.com/oriuplift/uponly-238-no-talking-wav/s-AyZUd',
'md5': '64a60b16e617d41d0bef032b7f55441e',
'info_dict': {
'id': '340344461',
'ext': 'wav',
'title': 'Uplifting Only 238 [No Talking] (incl. Alex Feed Guestmix) (Aug 31, 2017) [wav]',
'description': 'md5:fa20ee0fca76a3d6df8c7e57f3715366',
'uploader': 'Ori Uplift Music',
'upload_date': '20170831',
'duration': 7449,
'license': 'all-rights-reserved',
},
},
]
_CLIENT_ID = 'JlZIsxg2hY5WnBgtn3jfS0UYCl0K8DOg'
@ -160,11 +176,13 @@ class SoundcloudIE(InfoExtractor):
'license': info.get('license'),
}
formats = []
query = {'client_id': self._CLIENT_ID}
if secret_token is not None:
query['secret_token'] = secret_token
if info.get('downloadable', False):
# We can build a direct link to the song
format_url = (
'https://api.soundcloud.com/tracks/{0}/download?client_id={1}'.format(
track_id, self._CLIENT_ID))
format_url = update_url_query(
'https://api.soundcloud.com/tracks/%s/download' % track_id, query)
formats.append({
'format_id': 'download',
'ext': info.get('original_format', 'mp3'),
@ -176,10 +194,7 @@ class SoundcloudIE(InfoExtractor):
# We have to retrieve the url
format_dict = self._download_json(
'https://api.soundcloud.com/i1/tracks/%s/streams' % track_id,
track_id, 'Downloading track url', query={
'client_id': self._CLIENT_ID,
'secret_token': secret_token,
})
track_id, 'Downloading track url', query=query)
for key, stream_url in format_dict.items():
abr = int_or_none(self._search_regex(
@ -216,7 +231,7 @@ class SoundcloudIE(InfoExtractor):
# cannot be always used, sometimes it can give an HTTP 404 error
formats.append({
'format_id': 'fallback',
'url': info['stream_url'] + '?client_id=' + self._CLIENT_ID,
'url': update_url_query(info['stream_url'], query),
'ext': ext,
})

View File

@ -7,6 +7,7 @@ import hashlib
import json
from .adobepass import AdobePassIE
from .youtube import YoutubeIE
from .common import InfoExtractor
from ..compat import compat_HTTPError
from ..utils import (
@ -261,11 +262,9 @@ class ViceArticleIE(InfoExtractor):
if embed_code:
return _url_res('ooyala:%s' % embed_code, 'Ooyala')
youtube_url = self._html_search_regex(
r'<iframe[^>]+src="(.*youtube\.com/.*)"',
body, 'YouTube URL', default=None)
youtube_url = YoutubeIE._extract_url(body)
if youtube_url:
return _url_res(youtube_url, 'Youtube')
return _url_res(youtube_url, YoutubeIE.ie_key())
video_url = self._html_search_regex(
r'data-video-url="([^"]+)"',

View File

@ -263,29 +263,35 @@ class VidmeListBaseIE(InfoExtractor):
class VidmeUserIE(VidmeListBaseIE):
IE_NAME = 'vidme:user'
_VALID_URL = r'https?://vid\.me/(?:e/)?(?P<id>[\da-zA-Z]{6,})(?!/likes)(?:[^\da-zA-Z]|$)'
_VALID_URL = r'https?://vid\.me/(?:e/)?(?P<id>[\da-zA-Z_-]{6,})(?!/likes)(?:[^\da-zA-Z_-]|$)'
_API_ITEM = 'list'
_TITLE = 'Videos'
_TEST = {
'url': 'https://vid.me/EFARCHIVE',
_TESTS = [{
'url': 'https://vid.me/MasakoX',
'info_dict': {
'id': '3834632',
'title': 'EFARCHIVE - %s' % _TITLE,
'id': '16112341',
'title': 'MasakoX - %s' % _TITLE,
},
'playlist_mincount': 238,
}
'playlist_mincount': 191,
}, {
'url': 'https://vid.me/unsQuare_netWork',
'only_matching': True,
}]
class VidmeUserLikesIE(VidmeListBaseIE):
IE_NAME = 'vidme:user:likes'
_VALID_URL = r'https?://vid\.me/(?:e/)?(?P<id>[\da-zA-Z]{6,})/likes'
_VALID_URL = r'https?://vid\.me/(?:e/)?(?P<id>[\da-zA-Z_-]{6,})/likes'
_API_ITEM = 'likes'
_TITLE = 'Likes'
_TEST = {
_TESTS = [{
'url': 'https://vid.me/ErinAlexis/likes',
'info_dict': {
'id': '6483530',
'title': 'ErinAlexis - %s' % _TITLE,
},
'playlist_mincount': 415,
}
}, {
'url': 'https://vid.me/Kaleidoscope-Ish/likes',
'only_matching': True,
}]

View File

@ -25,6 +25,7 @@ from ..utils import (
from .dailymotion import DailymotionIE
from .pladform import PladformIE
from .vimeo import VimeoIE
from .youtube import YoutubeIE
class VKBaseIE(InfoExtractor):
@ -345,11 +346,9 @@ class VKIE(VKBaseIE):
if re.search(error_re, info_page):
raise ExtractorError(error_msg % video_id, expected=True)
youtube_url = self._search_regex(
r'<iframe[^>]+src="((?:https?:)?//www.youtube.com/embed/[^"]+)"',
info_page, 'youtube iframe', default=None)
youtube_url = YoutubeIE._extract_url(info_page)
if youtube_url:
return self.url_result(youtube_url, 'Youtube')
return self.url_result(youtube_url, ie=YoutubeIE.ie_key())
vimeo_url = VimeoIE._extract_url(url, info_page)
if vimeo_url is not None:

View File

@ -1374,6 +1374,43 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
playback_url, video_id, 'Marking watched',
'Unable to mark watched', fatal=False)
@staticmethod
def _extract_urls(webpage):
# Embedded YouTube player
entries = [
unescapeHTML(mobj.group('url'))
for mobj in re.finditer(r'''(?x)
(?:
<iframe[^>]+?src=|
data-video-url=|
<embed[^>]+?src=|
embedSWF\(?:\s*|
<object[^>]+data=|
new\s+SWFObject\(
)
(["\'])
(?P<url>(?:https?:)?//(?:www\.)?youtube(?:-nocookie)?\.com/
(?:embed|v|p)/.+?)
\1''', webpage)]
# lazyYT YouTube embed
entries.extend(list(map(
unescapeHTML,
re.findall(r'class="lazyYT" data-youtube-id="([^"]+)"', webpage))))
# Wordpress "YouTube Video Importer" plugin
matches = re.findall(r'''(?x)<div[^>]+
class=(?P<q1>[\'"])[^\'"]*\byvii_single_video_player\b[^\'"]*(?P=q1)[^>]+
data-video_id=(?P<q2>[\'"])([^\'"]+)(?P=q2)''', webpage)
entries.extend(m[-1] for m in matches)
return entries
@staticmethod
def _extract_url(webpage):
urls = YoutubeIE._extract_urls(webpage)
return urls[0] if urls else None
@classmethod
def extract_id(cls, url):
mobj = re.match(cls._VALID_URL, url, re.VERBOSE)

View File

@ -1815,6 +1815,10 @@ def float_or_none(v, scale=1, invscale=1, default=None):
return default
def bool_or_none(v, default=None):
return v if isinstance(v, bool) else default
def strip_or_none(v):
return None if v is None else v.strip()

View File

@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2017.09.02'
__version__ = '2017.09.11'