Mirror of https://github.com/l1ving/youtube-dl, synced 2025-03-28 10:11:29 +08:00

commit b01fa6bdbc

Merge branch 'master' of https://github.com/rg3/youtube-dl into multipart_videos

Conflicts:
	youtube_dl/YoutubeDL.py
	youtube_dl/__init__.py
	youtube_dl/extractor/common.py
	youtube_dl/extractor/mtv.py
	youtube_dl/extractor/tudou.py
	youtube_dl/postprocessor/__init__.py
	youtube_dl/postprocessor/ffmpeg.py
.gitignore (vendored, 2 changes):

```diff
@@ -31,3 +31,5 @@ updates_key.pem
 test/testdata
 .tox
 youtube-dl.zsh
+.idea
+.idea/*
```
.travis.yml:

```diff
@@ -4,12 +4,14 @@ python:
   - "2.7"
   - "3.3"
   - "3.4"
+before_install:
+  - sudo apt-get update -qq
+  - sudo apt-get install -yqq rtmpdump
 script: nosetests test --verbose
 notifications:
   email:
     - filippo.valsorda@gmail.com
     - phihag@phihag.de
-    - jaime.marquinez.ferrandiz+travis@gmail.com
     - yasoob.khld@gmail.com
 #  irc:
 #    channels:
```
AUTHORS (23 additions):

```diff
@@ -88,3 +88,26 @@ Dao Hoang Son
 Oskar Jauch
 Matthew Rayfield
 t0mm0
+Tithen-Firion
+Zack Fernandes
+cryptonaut
+Adrian Kretz
+Mathias Rav
+Petr Kutalek
+Will Glynn
+Max Reimann
+Cédric Luthi
+Thijs Vermeir
+Joel Leclerc
+Christopher Krooss
+Ondřej Caletka
+Dinesh S
+Johan K. Jensen
+Yen Chi Hsuan
+Enam Mijbah Noor
+David Luhmer
+Shaya Goldberg
+Paul Hartmann
+Frans de Jonge
+Robin de Rooij
+Ryan Schmidt
```
CONTRIBUTING.md (new file, 138 lines):

**Please include the full output of youtube-dl when run with `-v`**.

The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.

Please re-read your issue once again to avoid a couple of common mistakes (you can and should use this as a checklist):

### Is the description of the issue itself sufficient?

We often get issue reports that we cannot really decipher. While in most cases we eventually get the required information after asking back multiple times, this poses an unnecessary drain on our resources. Many contributors, including myself, are also not native speakers, so we may misread some parts.

So please elaborate on what feature you are requesting, or what bug you want to be fixed. Make sure that it's obvious

- What the problem is
- How it could be fixed
- What your proposed solution would look like

If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a committer myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.

For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the -v flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.

Site support requests **must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.

### Are you using the latest version?

Before reporting any issue, type `youtube-dl -U`. This should report that you're up-to-date. About 20% of the reports we receive are already fixed, but people are using outdated versions. This goes for feature requests as well.

### Is the issue already documented?

Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or at https://github.com/rg3/youtube-dl/search?type=Issues . If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.

### Why are existing options not enough?

Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#synopsis). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.

### Is there enough context in your bug report?

People want to solve problems, and often think they do us a favor by breaking down their larger problems (e.g. wanting to skip already downloaded files) to a specific request (e.g. requesting us to look whether the file exists before downloading the info page). However, what often happens is that they break down the problem into two steps: One simple, and one impossible (or extremely complicated one).

We are then presented with a very complicated request when the original problem could be solved far easier, e.g. by recording the downloaded video IDs in a separate file. To avoid this, you must include the greater context where it is non-obvious. In particular, every feature request that does not consist of adding support for a new site should contain a use case scenario that explains in what situation the missing feature would be useful.

### Does the issue involve one problem, and one problem only?

Some of our users seem to think there is a limit of issues they can or should open. There is no limit of issues they can or should open. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.

In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, Whitehouse podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.

### Is anyone going to need the feature?

Only post features that you (or an incapacitated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.

### Is your question about youtube-dl?

It may sound strange, but some bug reports we receive are completely unrelated to youtube-dl and relate to a different or even the reporter's own application. Please make sure that you are actually using youtube-dl. If you are using a UI for youtube-dl, report the bug to the maintainer of the actual application providing the UI. On the other hand, if your UI for youtube-dl fails in some way you believe is related to youtube-dl, by all means, go ahead and report the bug.

# DEVELOPER INSTRUCTIONS

Most users do not need to build youtube-dl and can [download the builds](http://rg3.github.io/youtube-dl/download.html) or get them from their distribution.

To run youtube-dl as a developer, you don't need to build anything either. Simply execute

    python -m youtube_dl

To run the test, simply invoke your favorite test runner, or execute a test file directly; any of the following work:

    python -m unittest discover
    python test/test_download.py
    nosetests

If you want to create a build of youtube-dl yourself, you'll need

* python
* make
* pandoc
* zip
* nosetests

### Adding support for a new site

If you want to add support for a new site, you can follow this quick list (assuming your service is called `yourextractor`):

1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with `git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git`
3. Start a new git branch with `cd youtube-dl; git checkout -b yourextractor`
4. Start with this simple template and save it to `youtube_dl/extractor/yourextractor.py`:

```python
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor


class YourExtractorIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?yourextractor\.com/watch/(?P<id>[0-9]+)'
    _TEST = {
        'url': 'http://yourextractor.com/watch/42',
        'md5': 'TODO: md5 sum of the first 10241 bytes of the video file (use --test)',
        'info_dict': {
            'id': '42',
            'ext': 'mp4',
            'title': 'Video title goes here',
            'thumbnail': 're:^https?://.*\.jpg$',
            # TODO more properties, either as:
            # * A value
            # * MD5 checksum; start the string with md5:
            # * A regular expression; start the string with re:
            # * Any Python type (for example int or float)
        }
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)

        # TODO more code goes here, for example ...
        title = self._html_search_regex(r'<h1>(.*?)</h1>', webpage, 'title')

        return {
            'id': video_id,
            'title': title,
            'description': self._og_search_description(webpage),
            # TODO more properties (see youtube_dl/extractor/common.py)
        }
```
5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries (see the sketch below). The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L38). Add tests and code for as many as you want.
8. If you can, check the code with [flake8](https://pypi.python.org/pypi/flake8).
9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files, [commit](http://git-scm.com/docs/git-commit) them, and [push](http://git-scm.com/docs/git-push) the result, like this:

        $ git add youtube_dl/extractor/__init__.py
        $ git add youtube_dl/extractor/yourextractor.py
        $ git commit -m '[yourextractor] Add new extractor'
        $ git push origin yourextractor
10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.

In any case, thank you very much for your contributions!
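For step 6 of the walkthrough above, a minimal sketch of what the multi-test form can look like. The second URL and the `only_matching` flag are illustrative placeholders, not part of the original template:

```python
# Hypothetical _TESTS list for YourExtractorIE; the test runner derives
# test_YourExtractor, test_YourExtractor_1, ... from the list positions.
_TESTS = [{
    'url': 'http://yourextractor.com/watch/42',
    'md5': 'TODO: md5 sum of the first 10241 bytes of the video file (use --test)',
    'info_dict': {
        'id': '42',
        'ext': 'mp4',
        'title': 'Video title goes here',
    },
}, {
    # A second entry that only checks URL matching, without downloading
    'url': 'http://yourextractor.com/watch/43',
    'only_matching': True,
}]
```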
Makefile (26 changes):

```diff
@@ -1,10 +1,7 @@
-all: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish
+all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
 
 clean:
-	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part
-
-cleanall: clean
-	rm -f youtube-dl youtube-dl.exe
+	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
 
 PREFIX ?= /usr/local
 BINDIR ?= $(PREFIX)/bin
@@ -35,13 +32,22 @@ install: youtube-dl youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtu
 	install -d $(DESTDIR)$(SYSCONFDIR)/fish/completions
 	install -m 644 youtube-dl.fish $(DESTDIR)$(SYSCONFDIR)/fish/completions/youtube-dl.fish
 
+codetest:
+	flake8 .
+
 test:
 	#nosetests --with-coverage --cover-package=youtube_dl --cover-html --verbose --processes 4 test
 	nosetests --verbose test
+	$(MAKE) codetest
+
+ot: offlinetest
+
+offlinetest: codetest
+	nosetests --verbose test --exclude test_download --exclude test_age_restriction --exclude test_subtitles --exclude test_write_annotations --exclude test_youtube_lists
 
 tar: youtube-dl.tar.gz
 
-.PHONY: all clean install test tar bash-completion pypi-files zsh-completion fish-completion
+.PHONY: all clean install test tar bash-completion pypi-files zsh-completion fish-completion ot offlinetest codetest supportedsites
 
 pypi-files: youtube-dl.bash-completion README.txt youtube-dl.1 youtube-dl.fish
 
@@ -54,7 +60,13 @@ youtube-dl: youtube_dl/*.py youtube_dl/*/*.py
 	chmod a+x youtube-dl
 
 README.md: youtube_dl/*.py youtube_dl/*/*.py
-	COLUMNS=80 python -m youtube_dl --help | python devscripts/make_readme.py
+	COLUMNS=80 python youtube_dl/__main__.py --help | python devscripts/make_readme.py
+
+CONTRIBUTING.md: README.md
+	python devscripts/make_contributing.py README.md CONTRIBUTING.md
+
+supportedsites:
+	python devscripts/make_supportedsites.py docs/supportedsites.md
 
 README.txt: README.md
 	pandoc -f markdown -t plain README.md -o README.txt
```
README.md (253 changes):

```diff
@@ -1,7 +1,15 @@
 youtube-dl - download videos from youtube.com or other video platforms
 
-# SYNOPSIS
-**youtube-dl** [OPTIONS] URL [URL...]
+- [INSTALLATION](#installation)
+- [DESCRIPTION](#description)
+- [OPTIONS](#options)
+- [CONFIGURATION](#configuration)
+- [OUTPUT TEMPLATE](#output-template)
+- [VIDEO SELECTION](#video-selection)
+- [FAQ](#faq)
+- [DEVELOPER INSTRUCTIONS](#developer-instructions)
+- [BUGS](#bugs)
+- [COPYRIGHT](#copyright)
 
 # INSTALLATION
 
@@ -34,6 +42,8 @@ YouTube.com and a few more sites. It requires the Python interpreter, version
 your Unix box, on Windows or on Mac OS X. It is released to the public domain,
 which means you can modify it, redistribute it or use it however you like.
 
+    youtube-dl [OPTIONS] URL [URL...]
+
 # OPTIONS
     -h, --help                       print this help text and exit
     --version                        print program version and exit
@@ -50,10 +60,6 @@ which means you can modify it, redistribute it or use it however you like.
                                      they would handle
     --extractor-descriptions         Output descriptions of all supported
                                      extractors
-    --proxy URL                      Use the specified HTTP/HTTPS proxy. Pass in
-                                     an empty string (--proxy "") for direct
-                                     connection
-    --socket-timeout None            Time to wait before giving up, in seconds
     --default-search PREFIX          Use this prefix for unqualified URLs. For
                                      example "gvsearch2:" downloads two videos
                                      from google videos for youtube-dl "large
@@ -65,16 +71,37 @@ which means you can modify it, redistribute it or use it however you like.
                                      this is not possible instead of searching.
     --ignore-config                  Do not read configuration files. When given
                                      in the global configuration file /etc
-                                     /youtube-dl.conf: do not read the user
-                                     configuration in ~/.config/youtube-dl.conf
-                                     (%APPDATA%/youtube-dl/config.txt on
-                                     Windows)
+                                     /youtube-dl.conf: Do not read the user
+                                     configuration in ~/.config/youtube-
+                                     dl/config (%APPDATA%/youtube-dl/config.txt
+                                     on Windows)
     --flat-playlist                  Do not extract the videos of a playlist,
                                      only list them.
+    --no-color                       Do not emit color codes in output.
+
+## Network Options:
+    --proxy URL                      Use the specified HTTP/HTTPS proxy. Pass in
+                                     an empty string (--proxy "") for direct
+                                     connection
+    --socket-timeout SECONDS         Time to wait before giving up, in seconds
+    --source-address IP              Client-side IP address to bind to
+                                     (experimental)
+    -4, --force-ipv4                 Make all connections via IPv4
+                                     (experimental)
+    -6, --force-ipv6                 Make all connections via IPv6
+                                     (experimental)
 
 ## Video Selection:
     --playlist-start NUMBER          playlist video to start at (default is 1)
     --playlist-end NUMBER            playlist video to end at (default is last)
+    --playlist-items ITEM_SPEC       playlist video items to download. Specify
+                                     indices of the videos in the playlist
+                                     separated by commas like: "--playlist-items
+                                     1,2,5,8" if you want to download videos
+                                     indexed 1, 2, 5, 8 in the playlist. You can
+                                     specify range: "--playlist-items
+                                     1-3,7,10-13", it will download the videos
+                                     at index 1, 2, 3, 7, 10, 11, 12 and 13.
     --match-title REGEX              download only matching titles (regex or
                                      caseless sub-string)
     --reject-title REGEX             skip download for matching titles (regex or
```
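As an aside on the ITEM_SPEC syntax introduced above: the comma-separated indices and ranges expand roughly like the following sketch. This is a simplified illustration, not youtube-dl's actual parser:

```python
def parse_playlist_items(spec):
    # Expand a spec like "1-3,7,10-13" into [1, 2, 3, 7, 10, 11, 12, 13]
    indices = []
    for part in spec.split(','):
        if '-' in part:
            start, end = part.split('-')
            indices.extend(range(int(start), int(end) + 1))
        else:
            indices.append(int(part))
    return indices


assert parse_playlist_items('1-3,7,10-13') == [1, 2, 3, 7, 10, 11, 12, 13]
```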
```diff
@@ -93,6 +120,23 @@ which means you can modify it, redistribute it or use it however you like.
                                      COUNT views
     --max-views COUNT                Do not download any videos with more than
                                      COUNT views
+    --match-filter FILTER            (Experimental) Generic video filter.
+                                     Specify any key (see help for -o for a list
+                                     of available keys) to match if the key is
+                                     present, !key to check if the key is not
+                                     present, key > NUMBER (like "comment_count
+                                     > 12", also works with >=, <, <=, !=, =) to
+                                     compare against a number, and & to require
+                                     multiple matches. Values which are not
+                                     known are excluded unless you put a
+                                     question mark (?) after the operator. For
+                                     example, to only match videos that have
+                                     been liked more than 100 times and disliked
+                                     less than 50 times (or the dislike
+                                     functionality is not available at the given
+                                     service), but that also have a description,
+                                     use --match-filter "like_count > 100 &
+                                     dislike_count <? 50 & description" .
     --no-playlist                    If the URL refers to a video and a
                                      playlist, download only the video.
     --age-limit YEARS                download only videos suitable for the given
```
```diff
@@ -106,19 +150,27 @@ which means you can modify it, redistribute it or use it however you like.
 ## Download Options:
     -r, --rate-limit LIMIT           maximum download rate in bytes per second
                                      (e.g. 50K or 4.2M)
-    -R, --retries RETRIES            number of retries (default is 10)
+    -R, --retries RETRIES            number of retries (default is 10), or
+                                     "infinite".
     --buffer-size SIZE               size of download buffer (e.g. 1024 or 16K)
                                      (default is 1024)
     --no-resize-buffer               do not automatically adjust the buffer
                                      size. By default, the buffer size is
                                      automatically resized from an initial value
                                      of SIZE.
+    --playlist-reverse               Download playlist videos in reverse order
+    --xattr-set-filesize             (experimental) set file xattribute
+                                     ytdl.filesize with expected filesize
+    --hls-prefer-native              (experimental) Use the native HLS
+                                     downloader instead of ffmpeg.
+    --external-downloader COMMAND    (experimental) Use the specified external
+                                     downloader. Currently supports
+                                     aria2c,curl,wget
 
 ## Filesystem Options:
     -a, --batch-file FILE            file containing URLs to download ('-' for
                                      stdin)
     --id                             use only video ID in file name
-    -A, --auto-number                number downloaded files starting from 00000
     -o, --output TEMPLATE            output filename template. Use %(title)s to
                                      get the title, %(uploader)s for the
                                      uploader name, %(uploader_id)s for the
@@ -152,6 +204,9 @@ which means you can modify it, redistribute it or use it however you like.
     --restrict-filenames             Restrict filenames to only ASCII
                                      characters, and avoid "&" and spaces in
                                      filenames
+    -A, --auto-number                [deprecated; use -o
+                                     "%(autonumber)s-%(title)s.%(ext)s" ] number
+                                     downloaded files starting from 00000
     -t, --title                      [deprecated] use title in file name
                                      (default)
     -l, --literal                    [deprecated] alias of --title
@@ -170,7 +225,6 @@ which means you can modify it, redistribute it or use it however you like.
     --write-info-json                write video metadata to a .info.json file
     --write-annotations              write video annotations to a .annotation
                                      file
-    --write-thumbnail                write thumbnail image to disk
     --load-info FILE                 json file containing the video information
                                      (created with the "--write-json" option)
     --cookies FILE                   file to read cookies from and dump cookie
@@ -185,6 +239,12 @@ which means you can modify it, redistribute it or use it however you like.
     --no-cache-dir                   Disable filesystem caching
     --rm-cache-dir                   Delete all filesystem cache files
 
+## Thumbnail images:
+    --write-thumbnail                write thumbnail image to disk
+    --write-all-thumbnails           write all thumbnail image formats to disk
+    --list-thumbnails                Simulate and list all available thumbnail
+                                     formats
+
 ## Verbosity / Simulation Options:
     -q, --quiet                      activates quiet mode
     --no-warnings                    Ignore warnings
@@ -206,6 +266,8 @@ which means you can modify it, redistribute it or use it however you like.
                                      for each command-line argument. If the URL
                                      refers to a playlist, dump the whole
                                      playlist information in a single line.
+    --print-json                     Be quiet and print the video information as
+                                     JSON (video is still being downloaded).
     --newline                        output progress bar as new lines
     --no-progress                    do not print progress bar
     --console-title                  display progress in console titlebar
@@ -216,6 +278,10 @@ which means you can modify it, redistribute it or use it however you like.
                                      files in the current directory to debug
                                      problems
     --print-traffic                  Display sent and read HTTP traffic
+    -C, --call-home                  Contact the youtube-dl server for
+                                     debugging.
+    --no-call-home                   Do NOT contact the youtube-dl server for
+                                     debugging.
 
 ## Workarounds:
     --encoding ENCODING              Force the specified encoding (experimental)
@@ -232,17 +298,34 @@ which means you can modify it, redistribute it or use it however you like.
     --bidi-workaround                Work around terminals that lack
                                      bidirectional text support. Requires bidiv
                                      or fribidi executable in PATH
+    --sleep-interval SECONDS         Number of seconds to sleep before each
+                                     download.
 
 ## Video Format Options:
     -f, --format FORMAT              video format code, specify the order of
-                                     preference using slashes: -f 22/17/18 . -f
-                                     mp4 , -f m4a and -f flv are also
-                                     supported. You can also use the special
-                                     names "best", "bestvideo", "bestaudio",
-                                     "worst", "worstvideo" and "worstaudio". By
-                                     default, youtube-dl will pick the best
-                                     quality. Use commas to download multiple
-                                     audio formats, such as -f
+                                     preference using slashes, as in -f 22/17/18
+                                     . Instead of format codes, you can select
+                                     by extension for the extensions aac, m4a,
+                                     mp3, mp4, ogg, wav, webm. You can also use
+                                     the special names "best", "bestvideo",
+                                     "bestaudio", "worst". You can filter the
+                                     video results by putting a condition in
+                                     brackets, as in -f "best[height=720]" (or
+                                     -f "[filesize>10M]"). This works for
+                                     filesize, height, width, tbr, abr, vbr,
+                                     asr, and fps and the comparisons <, <=, >,
+                                     >=, =, != and for ext, acodec, vcodec,
+                                     container, and protocol and the comparisons
+                                     =, != . Formats for which the value is not
+                                     known are excluded unless you put a
+                                     question mark (?) after the operator. You
+                                     can combine format filters, so -f "[height
+                                     <=? 720][tbr>500]" selects up to 720p
+                                     videos (or videos where the height is not
+                                     known) with a bitrate of at least 500
+                                     KBit/s. By default, youtube-dl will pick
+                                     the best quality. Use commas to download
+                                     multiple audio formats, such as -f
                                      136/137/mp4/bestvideo,140/m4a/bestaudio.
                                      You can merge the video and audio of two
                                      formats into a single file using -f <video-
```
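The bracket filters described for `-f` boil down to comparing fields of a format dict against a given value. A rough sketch of one such check follows; this is a simplified illustration, not the actual youtube-dl implementation, and the sample format dict is made up:

```python
import operator

COMPARISONS = {
    '<': operator.lt, '<=': operator.le,
    '>': operator.gt, '>=': operator.ge,
    '=': operator.eq, '!=': operator.ne,
}


def format_matches(fmt, key, op, value, allow_unknown=False):
    # allow_unknown corresponds to the '?' suffix on the operator
    actual = fmt.get(key)
    if actual is None:
        return allow_unknown
    return COMPARISONS[op](actual, value)


fmt = {'format_id': '22', 'ext': 'mp4', 'height': 720, 'tbr': 568}
print(format_matches(fmt, 'height', '<=', 720))  # True, like [height<=720]
print(format_matches(fmt, 'abr', '>', 128))      # False: abr unknown, no '?'
```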
```diff
@@ -256,6 +339,10 @@ which means you can modify it, redistribute it or use it however you like.
     -F, --list-formats               list all available formats
     --youtube-skip-dash-manifest     Do not download the DASH manifest on
                                      YouTube videos
+    --merge-output-format FORMAT     If a merge is required (e.g.
+                                     bestvideo+bestaudio), output to given
+                                     container format. One of mkv, mp4, ogg,
+                                     webm, flv. Ignored if no merge is required
 
 ## Subtitle Options:
     --write-sub                      write subtitle file
@@ -272,7 +359,8 @@ which means you can modify it, redistribute it or use it however you like.
 
 ## Authentication Options:
     -u, --username USERNAME          login with this account ID
-    -p, --password PASSWORD          account password
+    -p, --password PASSWORD          account password. If this option is left
+                                     out, youtube-dl will ask interactively.
     -2, --twofactor TWOFACTOR        two-factor auth code
     -n, --netrc                      use .netrc authentication data
     --video-password PASSWORD        video password (vimeo, smotri)
@@ -302,10 +390,18 @@ which means you can modify it, redistribute it or use it however you like.
     --add-metadata                   write metadata to the video file
     --xattrs                         write metadata to the video file's xattrs
                                      (using dublin core and xdg standards)
+    --fixup POLICY                   Automatically correct known faults of the
+                                     file. One of never (do nothing), warn (only
+                                     emit a warning), detect_or_warn (the
+                                     default; fix file if we can, warn
+                                     otherwise)
     --prefer-avconv                  Prefer avconv over ffmpeg for running the
                                      postprocessors (default)
     --prefer-ffmpeg                  Prefer ffmpeg over avconv for running the
                                      postprocessors
+    --ffmpeg-location PATH           Location of the ffmpeg/avconv binary;
+                                     either the path to the binary or its
+                                     containing directory.
     --exec CMD                       Execute a command on the file after
                                      downloading, similar to find's -exec
                                      syntax. Example: --exec 'adb push {}
@@ -313,7 +409,7 @@ which means you can modify it, redistribute it or use it however you like.
 
 # CONFIGURATION
 
-You can configure youtube-dl by placing default arguments (such as `--extract-audio --no-mtime` to always extract the audio and not copy the mtime) into `/etc/youtube-dl.conf` and/or `~/.config/youtube-dl/config`. On Windows, the configuration file locations are `%APPDATA%\youtube-dl\config.txt` and `C:\Users\<Yourname>\youtube-dl.conf`.
+You can configure youtube-dl by placing default arguments (such as `--extract-audio --no-mtime` to always extract the audio and not copy the mtime) into `/etc/youtube-dl.conf` and/or `~/.config/youtube-dl/config`. On Windows, the configuration file locations are `%APPDATA%\youtube-dl\config.txt` and `C:\Users\<user name>\youtube-dl.conf`.
 
 # OUTPUT TEMPLATE
 
```
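The output templates referenced above are ordinary Python named-placeholder string formatting. A tiny sketch of the expansion, with made-up info-dict values for illustration:

```python
# An output template is expanded against the extracted info dict with
# Python %-style named interpolation.
info = {'title': 'youtube-dl test video', 'id': 'BaW_jenozKc', 'ext': 'mp4'}
template = '%(title)s-%(id)s.%(ext)s'
print(template % info)  # youtube-dl test video-BaW_jenozKc.mp4
```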
```diff
@@ -407,17 +503,27 @@ Apparently YouTube requires you to pass a CAPTCHA test if you download too much.
 
 Once the video is fully downloaded, use any video player, such as [vlc](http://www.videolan.org) or [mplayer](http://www.mplayerhq.hu/).
 
-### The links provided by youtube-dl -g are not working anymore
+### I extracted a video URL with -g, but it does not play on another machine / in my webbrowser.
 
-The URLs youtube-dl outputs require the downloader to have the correct cookies. Use the `--cookies` option to write the required cookies into a file, and advise your downloader to read cookies from that file. Some sites also require a common user agent to be used, use `--dump-user-agent` to see the one in use by youtube-dl.
+It depends a lot on the service. In many cases, requests for the video (to download/play it) must come from the same IP address and with the same cookies. Use the `--cookies` option to write the required cookies into a file, and advise your downloader to read cookies from that file. Some sites also require a common user agent to be used, use `--dump-user-agent` to see the one in use by youtube-dl.
+
+It may be beneficial to use IPv6; in some cases, the restrictions are only applied to IPv4. Some services (sometimes only for a subset of videos) do not restrict the video URL by IP address, cookie, or user-agent, but these are the exception rather than the rule.
+
+Please bear in mind that some URL protocols are **not** supported by browsers out of the box, including RTMP. If you are using -g, your own downloader must support these as well.
+
+If you want to play the video on a machine that is not running youtube-dl, you can relay the video content from the machine that runs youtube-dl. You can use `-o -` to let youtube-dl stream a video to stdout, or simply allow the player to download the files written by youtube-dl in turn.
 
 ### ERROR: no fmt_url_map or conn information found in video info
 
-youtube has switched to a new video info format in July 2011 which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`.
+YouTube switched to a new video info format in July 2011, which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
 
 ### ERROR: unable to download video ###
 
-youtube requires an additional signature since September 2012 which is not supported by old versions of youtube-dl. You can update youtube-dl with `sudo youtube-dl --update`.
+YouTube requires an additional signature since September 2012, which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
+
+### ExtractorError: Could not find JS function u'OF'
+
+In February 2015, the new YouTube player contained a character sequence in a string that was misinterpreted by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
 
 ### SyntaxError: Non-ASCII character ###
 
@@ -436,6 +542,41 @@ Since June 2012 (#342) youtube-dl is packed as an executable zipfile, simply unz
 
 To run the exe you need to install first the [Microsoft Visual C++ 2008 Redistributable Package](http://www.microsoft.com/en-us/download/details.aspx?id=29).
 
+### On Windows, how should I set up ffmpeg and youtube-dl? Where should I put the exe files?
+
+If you put youtube-dl and ffmpeg in the same directory that you're running the command from, it will work, but that's rather cumbersome.
+
+To make a different directory work - either for ffmpeg, or for youtube-dl, or for both - simply create the directory (say, `C:\bin`, or `C:\Users\<User name>\bin`), put all the executables directly in there, and then [set your PATH environment variable](https://www.java.com/en/download/help/path.xml) to include that directory.
+
+From then on, after restarting your shell, you will be able to access both youtube-dl and ffmpeg (and youtube-dl will be able to find ffmpeg) by simply typing `youtube-dl` or `ffmpeg`, no matter what directory you're in.
+
+### How do I put downloads into a specific folder?
+
+Use the `-o` option to specify an [output template](#output-template), for example `-o "/home/user/videos/%(title)s-%(id)s.%(ext)s"`. If you want this for all of your downloads, put the option into your [configuration file](#configuration).
+
+### How do I download a video starting with a `-`?
+
+Either prepend `http://www.youtube.com/watch?v=` or separate the ID from the options with `--`:
+
+    youtube-dl -- -wNyEUrxzFU
+    youtube-dl "http://www.youtube.com/watch?v=-wNyEUrxzFU"
+
+### Can you add support for this anime video site, or site which shows current movies for free?
+
+As a matter of policy (as well as legality), youtube-dl does not include support for services that specialize in infringing copyright. As a rule of thumb, if you cannot easily find a video that the service is quite obviously allowed to distribute (i.e. that has been uploaded by the creator, the creator's distributor, or is published under a free license), the service is probably unfit for inclusion in youtube-dl.
+
+A note on the service that they don't host the infringing content, but just link to those who do, is evidence that the service should **not** be included in youtube-dl. The same goes for any DMCA note when the whole front page of the service is filled with videos they are not allowed to distribute. A "fair use" note is equally unconvincing if the service shows copyright-protected videos in full without authorization.
+
+Support requests for services that **do** purchase the rights to distribute their content are perfectly fine though. If in doubt, you can simply include a source that mentions the legitimate purchase of content.
+
+### How can I detect whether a given URL is supported by youtube-dl?
+
+For one, have a look at the [list of supported sites](docs/supportedsites.md). Note that it can sometimes happen that the site changes its URL scheme (say, from http://example.com/video/1234567 to http://example.com/v/1234567 ) and youtube-dl then reports a URL of a service in that list as unsupported. In that case, simply report a bug.
+
+It is *not* possible to detect whether a URL is supported or not. That's because youtube-dl contains a generic extractor which matches **all** URLs. You may be tempted to disable, exclude, or remove the generic extractor, but the generic extractor not only allows users to extract videos from lots of websites that embed a video from another service, but may also be used to extract video from a service that it's hosting itself. Therefore, we neither recommend nor support disabling, excluding, or removing the generic extractor.
+
+If you want to find out whether a given URL is supported, simply call youtube-dl with it. If you get no videos back, chances are the URL is either not referring to a video or unsupported. You can find out which by examining the output (if you run youtube-dl on the console) or catching an `UnsupportedError` exception if you run it from a Python program.
+
 # DEVELOPER INSTRUCTIONS
 
 Most users do not need to build youtube-dl and can [download the builds](http://rg3.github.io/youtube-dl/download.html) or get them from their distribution.
```
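The last FAQ entry above can also be approached programmatically. A small sketch, assuming that probing the public `gen_extractors()` list is acceptable for your use case (the GenericIE's `IE_NAME` is `'generic'`):

```python
import youtube_dl


def has_dedicated_extractor(url):
    # The generic extractor matches every URL, so skip it when probing
    return any(
        ie.suitable(url) and ie.IE_NAME != 'generic'
        for ie in youtube_dl.gen_extractors())


print(has_dedicated_extractor('http://www.youtube.com/watch?v=BaW_jenozKc'))
```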
```diff
@@ -508,7 +649,7 @@ If you want to add support for a new site, you can follow this quick list (assum
 5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
 7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L38). Add tests and code for as many as you want.
-8. If you can, check the code with [pyflakes](https://pypi.python.org/pypi/pyflakes) (a good idea) and [pep8](https://pypi.python.org/pypi/pep8) (optional, ignore E501).
+8. If you can, check the code with [flake8](https://pypi.python.org/pypi/flake8).
 9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files, [commit](http://git-scm.com/docs/git-commit) them, and [push](http://git-scm.com/docs/git-push) the result, like this:
 
         $ git add youtube_dl/extractor/__init__.py
@@ -526,23 +667,61 @@ youtube-dl makes the best effort to be a good command-line program, and thus sho
 
 From a Python program, you can embed youtube-dl in a more powerful fashion, like this:
 
-    import youtube_dl
-
-    ydl_opts = {}
-    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
-        ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
+```python
+import youtube_dl
+
+ydl_opts = {}
+with youtube_dl.YoutubeDL(ydl_opts) as ydl:
+    ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
+```
 
 Most likely, you'll want to use various options. For a list of what can be done, have a look at [youtube_dl/YoutubeDL.py](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L69). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
 
+Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
+
+```python
+import youtube_dl
+
+
+class MyLogger(object):
+    def debug(self, msg):
+        pass
+
+    def warning(self, msg):
+        pass
+
+    def error(self, msg):
+        print(msg)
+
+
+def my_hook(d):
+    if d['status'] == 'finished':
+        print('Done downloading, now converting ...')
+
+
+ydl_opts = {
+    'format': 'bestaudio/best',
+    'postprocessors': [{
+        'key': 'FFmpegExtractAudio',
+        'preferredcodec': 'mp3',
+        'preferredquality': '192',
+    }],
+    'logger': MyLogger(),
+    'progress_hooks': [my_hook],
+}
+with youtube_dl.YoutubeDL(ydl_opts) as ydl:
+    ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
+```
+
 # BUGS
 
-Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues> . Unless you were prompted to do so or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email.
+Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues> . Unless you were prompted to do so or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the irc channel #youtube-dl on freenode.
 
-Please include the full output of the command when run with `--verbose`. The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
+**Please include the full output of youtube-dl when run with `-v`**.
 
-For discussions, join us in the irc channel #youtube-dl on freenode.
+The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
 
-When you submit a request, please re-read it once to avoid a couple of mistakes (you can and should use this as a checklist):
+Please re-read your issue once again to avoid a couple of common mistakes (you can and should use this as a checklist):
 
 ### Is the description of the issue itself sufficient?
 
@@ -586,7 +765,7 @@ In particular, every site support request issue should only pertain to services
 
 ### Is anyone going to need the feature?
 
-Only post features that you (or an incapicated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.
+Only post features that you (or an incapacitated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.
 
 ### Is your question about youtube-dl?
 
```
devscripts/gh-pages/update-sites.py:

```diff
@@ -16,7 +16,7 @@ def main():
         template = tmplf.read()
 
     ie_htmls = []
-    for ie in sorted(youtube_dl.gen_extractors(), key=lambda i: i.IE_NAME.lower()):
+    for ie in youtube_dl.list_extractors(age_limit=None):
         ie_html = '<b>{}</b>'.format(ie.IE_NAME)
         ie_desc = getattr(ie, 'IE_DESC', None)
         if ie_desc is False:
```
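The `list_extractors` helper that replaces the `sorted(...)` expression here behaves, roughly, like the same sort plus an age-limit filter. A sketch of the equivalent behaviour, not the verbatim implementation:

```python
import youtube_dl


def list_extractors_sketch(age_limit):
    # Extractors suitable for the given age limit, sorted case-insensitively
    return sorted(
        (ie for ie in youtube_dl.gen_extractors() if ie.is_suitable(age_limit)),
        key=lambda ie: ie.IE_NAME.lower())
```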
devscripts/make_contributing.py (new executable file, 32 lines):

```python
#!/usr/bin/env python
from __future__ import unicode_literals

import io
import optparse
import re


def main():
    parser = optparse.OptionParser(usage='%prog INFILE OUTFILE')
    options, args = parser.parse_args()
    if len(args) != 2:
        parser.error('Expected an input and an output filename')

    infile, outfile = args

    with io.open(infile, encoding='utf-8') as inf:
        readme = inf.read()

    bug_text = re.search(
        r'(?s)#\s*BUGS\s*[^\n]*\s*(.*?)#\s*COPYRIGHT', readme).group(1)
    dev_text = re.search(
        r'(?s)(#\s*DEVELOPER INSTRUCTIONS.*?)#\s*EMBEDDING YOUTUBE-DL',
        readme).group(1)

    out = bug_text + dev_text

    with io.open(outfile, 'w', encoding='utf-8') as outf:
        outf.write(out)

if __name__ == '__main__':
    main()
```
devscripts/make_supportedsites.py (new file, 45 lines):

```python
#!/usr/bin/env python
from __future__ import unicode_literals

import io
import optparse
import os
import sys


# Import youtube_dl
ROOT_DIR = os.path.join(os.path.dirname(__file__), '..')
sys.path.append(ROOT_DIR)
import youtube_dl


def main():
    parser = optparse.OptionParser(usage='%prog OUTFILE.md')
    options, args = parser.parse_args()
    if len(args) != 1:
        parser.error('Expected an output filename')

    outfile, = args

    def gen_ies_md(ies):
        for ie in ies:
            ie_md = '**{0}**'.format(ie.IE_NAME)
            ie_desc = getattr(ie, 'IE_DESC', None)
            if ie_desc is False:
                continue
            if ie_desc is not None:
                ie_md += ': {0}'.format(ie.IE_DESC)
            if not ie.working():
                ie_md += ' (Currently broken)'
            yield ie_md

    ies = sorted(youtube_dl.gen_extractors(), key=lambda i: i.IE_NAME.lower())
    out = '# Supported sites\n' + ''.join(
        ' - ' + md + '\n'
        for md in gen_ies_md(ies))

    with io.open(outfile, 'w', encoding='utf-8') as outf:
        outf.write(out)

if __name__ == '__main__':
    main()
```
devscripts/prepare_manpage.py:

```diff
@@ -11,8 +11,19 @@ README_FILE = os.path.join(ROOT_DIR, 'README.md')
 with io.open(README_FILE, encoding='utf-8') as f:
     readme = f.read()
 
-PREFIX = '%YOUTUBE-DL(1)\n\n# NAME\n'
-readme = re.sub(r'(?s)# INSTALLATION.*?(?=# DESCRIPTION)', '', readme)
+PREFIX = '''%YOUTUBE-DL(1)
+
+# NAME
+
+youtube\-dl \- download videos from youtube.com or other video platforms
+
+# SYNOPSIS
+
+**youtube-dl** \[OPTIONS\] URL [URL...]
+
+'''
+readme = re.sub(r'(?s)^.*?(?=# DESCRIPTION)', '', readme)
+readme = re.sub(r'\s+youtube-dl \[OPTIONS\] URL \[URL\.\.\.\]', '', readme)
 readme = PREFIX + readme
 
 if sys.version_info < (3, 0):
```
devscripts/release.sh:

```diff
@@ -35,7 +35,7 @@ if [ ! -z "$useless_files" ]; then echo "ERROR: Non-.py files in youtube_dl: $us
 if [ ! -f "updates_key.pem" ]; then echo 'ERROR: updates_key.pem missing'; exit 1; fi
 
 /bin/echo -e "\n### First of all, testing..."
-make cleanall
+make clean
 if $skip_tests ; then
     echo 'SKIPPING TESTS'
 else
@@ -45,9 +45,9 @@ fi
 /bin/echo -e "\n### Changing version in version.py..."
 sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
 
-/bin/echo -e "\n### Committing README.md and youtube_dl/version.py..."
-make README.md
-git add README.md youtube_dl/version.py
+/bin/echo -e "\n### Committing documentation and youtube_dl/version.py..."
+make README.md CONTRIBUTING.md supportedsites
+git add README.md CONTRIBUTING.md docs/supportedsites.md youtube_dl/version.py
 git commit -m "release $version"
 
 /bin/echo -e "\n### Now tagging, signing and pushing..."
```
565
docs/supportedsites.md
Normal file
@ -0,0 +1,565 @@
# Supported sites
 - **1tv**: Первый канал
 - **1up.com**
 - **220.ro**
 - **24video**
 - **3sat**
 - **4tube**
 - **56.com**
 - **5min**
 - **8tracks**
 - **9gag**
 - **abc.net.au**
 - **Abc7News**
 - **AcademicEarth:Course**
 - **AddAnime**
 - **AdobeTV**
 - **AdultSwim**
 - **Aftenposten**
 - **Aftonbladet**
 - **AlJazeera**
 - **Allocine**
 - **AlphaPorno**
 - **anitube.se**
 - **AnySex**
 - **Aparat**
 - **AppleDailyAnimationNews**
 - **AppleDailyRealtimeNews**
 - **AppleTrailers**
 - **archive.org**: archive.org videos
 - **ARD**
 - **ARD:mediathek**
 - **arte.tv**
 - **arte.tv:+7**
 - **arte.tv:concert**
 - **arte.tv:creative**
 - **arte.tv:ddc**
 - **arte.tv:embed**
 - **arte.tv:future**
 - **AtresPlayer**
 - **ATTTechChannel**
 - **audiomack**
 - **audiomack:album**
 - **Azubu**
 - **bambuser**
 - **bambuser:channel**
 - **Bandcamp**
 - **Bandcamp:album**
 - **bbc.co.uk**: BBC iPlayer
 - **Beeg**
 - **BehindKink**
 - **Bet**
 - **Bild**: Bild.de
 - **BiliBili**
 - **blinkx**
 - **blip.tv:user**
 - **BlipTV**
 - **Bloomberg**
 - **Bpb**: Bundeszentrale für politische Bildung
 - **BR**: Bayerischer Rundfunk Mediathek
 - **Break**
 - **Brightcove**
 - **BuzzFeed**
 - **BYUtv**
 - **Camdemy**
 - **CamdemyFolder**
 - **Canal13cl**
 - **canalc2.tv**
 - **Canalplus**: canalplus.fr, piwiplus.fr and d8.tv
 - **CBS**
 - **CBSNews**: CBS News
 - **CBSSports**
 - **CeskaTelevize**
 - **channel9**: Channel 9
 - **Chilloutzone**
 - **Cinchcast**
 - **Cinemassacre**
 - **clipfish**
 - **cliphunter**
 - **Clipsyndicate**
 - **Cloudy**
 - **Clubic**
 - **cmt.com**
 - **CNET**
 - **CNN**
 - **CNNArticle**
 - **CNNBlogs**
 - **CollegeHumor**
 - **CollegeRama**
 - **ComCarCoff**
 - **ComedyCentral**
 - **ComedyCentralShows**: The Daily Show / The Colbert Report
 - **CondeNast**: Condé Nast media group: Condé Nast, GQ, Glamour, Vanity Fair, Vogue, W Magazine, WIRED
 - **Cracked**
 - **Criterion**
 - **Crunchyroll**
 - **crunchyroll:playlist**
 - **CSpan**: C-SPAN
 - **CtsNews**
 - **culturebox.francetvinfo.fr**
 - **dailymotion**
 - **dailymotion:playlist**
 - **dailymotion:user**
 - **daum.net**
 - **DBTV**
 - **DctpTv**
 - **DeezerPlaylist**
 - **defense.gouv.fr**
 - **Discovery**
 - **divxstage**: DivxStage
 - **Dotsub**
 - **DRBonanza**
 - **Dropbox**
 - **DrTuber**
 - **DRTV**
 - **Dump**
 - **dvtv**: http://video.aktualne.cz/
 - **EbaumsWorld**
 - **EchoMsk**
 - **eHow**
 - **Einthusan**
 - **eitb.tv**
 - **EllenTV**
 - **EllenTV:clips**
 - **ElPais**: El País
 - **Embedly**
 - **EMPFlix**
 - **Engadget**
 - **Eporner**
 - **EroProfile**
 - **Escapist**
 - **EveryonesMixtape**
 - **exfm**: ex.fm
 - **ExpoTV**
 - **ExtremeTube**
 - **facebook**
 - **faz.net**
 - **fc2**
 - **fernsehkritik.tv**
 - **fernsehkritik.tv:postecke**
 - **Firedrive**
 - **Firstpost**
 - **Flickr**
 - **Folketinget**: Folketinget (ft.dk; Danish parliament)
 - **Foxgay**
 - **FoxNews**
 - **france2.fr:generation-quoi**
 - **FranceCulture**
 - **FranceInter**
 - **francetv**: France 2, 3, 4, 5 and Ô
 - **francetvinfo.fr**
 - **Freesound**
 - **freespeech.org**
 - **FreeVideo**
 - **FunnyOrDie**
 - **Gamekings**
 - **GameOne**
 - **gameone:playlist**
 - **GameSpot**
 - **GameStar**
 - **Gametrailers**
 - **GDCVault**
 - **generic**: Generic downloader that works on some sites
 - **GiantBomb**
 - **Giga**
 - **Glide**: Glide mobile video messages (glide.me)
 - **Globo**
 - **GodTube**
 - **GoldenMoustache**
 - **Golem**
 - **GorillaVid**: GorillaVid.in, daclips.in, movpod.in and fastvideo.in
 - **Goshgay**
 - **Grooveshark**
 - **Groupon**
 - **Hark**
 - **HearThisAt**
 - **Heise**
 - **HellPorno**
 - **Helsinki**: helsinki.fi
 - **HentaiStigma**
 - **HistoricFilms**
 - **History**
 - **hitbox**
 - **hitbox:live**
 - **HornBunny**
 - **HostingBulk**
 - **HotNewHipHop**
 - **Howcast**
 - **HowStuffWorks**
 - **HuffPost**: Huffington Post
 - **Hypem**
 - **Iconosquare**
 - **ign.com**
 - **imdb**: Internet Movie Database trailers
 - **imdb:list**: Internet Movie Database lists
 - **Imgur**
 - **Ina**
 - **InfoQ**
 - **Instagram**
 - **instagram:user**: Instagram user profile
 - **InternetVideoArchive**
 - **IPrima**
 - **ivi**: ivi.ru
 - **ivi:compilation**: ivi.ru compilations
 - **Izlesene**
 - **JadoreCettePub**
 - **JeuxVideo**
 - **Jove**
 - **jpopsuki.tv**
 - **Jukebox**
 - **Kankan**
 - **Karaoketv**
 - **keek**
 - **KeezMovies**
 - **KhanAcademy**
 - **KickStarter**
 - **kontrtube**: KontrTube.ru - Труба зовёт
 - **KrasView**: Красвью
 - **Ku6**
 - **la7.tv**
 - **Laola1Tv**
 - **lifenews**: LIFE | NEWS
 - **LiveLeak**
 - **livestream**
 - **livestream:original**
 - **LnkGo**
 - **lrt.lt**
 - **lynda**: lynda.com videos
 - **lynda:course**: lynda.com online courses
 - **m6**
 - **macgamestore**: MacGameStore trailers
 - **mailru**: Видео@Mail.Ru
 - **Malemotion**
 - **MDR**
 - **media.ccc.de**
 - **metacafe**
 - **Metacritic**
 - **Mgoon**
 - **Minhateca**
 - **MinistryGrid**
 - **mitele.es**
 - **mixcloud**
 - **MLB**
 - **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
 - **Mofosex**
 - **Mojvideo**
 - **Moniker**: allmyvideos.net and vidspot.net
 - **mooshare**: Mooshare.biz
 - **Morningstar**: morningstar.com
 - **Motherless**
 - **Motorsport**: motorsport.com
 - **MovieClips**
 - **Moviezine**
 - **movshare**: MovShare
 - **MPORA**
 - **MTV**
 - **mtviggy.com**
 - **mtvservices:embedded**
 - **MuenchenTV**: münchen.tv
 - **MusicPlayOn**
 - **MusicVault**
 - **muzu.tv**
 - **MySpace**
 - **MySpace:album**
 - **MySpass**
 - **myvideo**
 - **MyVidster**
 - **n-tv.de**
 - **NationalGeographic**
 - **Naver**
 - **NBA**
 - **NBC**
 - **NBCNews**
 - **ndr**: NDR.de - Mediathek
 - **NDTV**
 - **NerdCubedFeed**
 - **Nerdist**
 - **Netzkino**
 - **Newgrounds**
 - **Newstube**
 - **NextMedia**
 - **NextMediaActionNews**
 - **nfb**: National Film Board of Canada
 - **nfl.com**
 - **nhl.com**
 - **nhl.com:news**: NHL news
 - **nhl.com:videocenter**: NHL videocenter category
 - **niconico**: ニコニコ動画
 - **NiconicoPlaylist**
 - **Noco**
 - **Normalboots**
 - **NosVideo**
 - **novamov**: NovaMov
 - **Nowness**
 - **nowvideo**: NowVideo
 - **npo.nl**
 - **npo.nl:live**
 - **npo.nl:radio**
 - **npo.nl:radio:fragment**
 - **NRK**
 - **NRKTV**
 - **ntv.ru**
 - **Nuvid**
 - **NYTimes**
 - **ocw.mit.edu**
 - **OktoberfestTV**
 - **on.aol.com**
 - **Ooyala**
 - **OpenFilm**
 - **orf:fm4**: radio FM4
 - **orf:oe1**: Radio Österreich 1
 - **orf:tvthek**: ORF TVthek
 - **parliamentlive.tv**: UK parliament videos
 - **Patreon**
 - **PBS**
 - **Phoenix**
 - **Photobucket**
 - **PlanetaPlay**
 - **play.fm**
 - **played.to**
 - **Playvid**
 - **plus.google**: Google Plus
 - **pluzz.francetv.fr**
 - **podomatic**
 - **PornHd**
 - **PornHub**
 - **PornHubPlaylist**
 - **Pornotube**
 - **PornoXO**
 - **PromptFile**
 - **prosiebensat1**: ProSiebenSat.1 Digital
 - **Pyvideo**
 - **QuickVid**
 - **radio.de**
 - **radiobremen**
 - **radiofrance**
 - **Rai**
 - **RBMARadio**
 - **RedTube**
 - **Restudy**
 - **ReverbNation**
 - **RingTV**
 - **RottenTomatoes**
 - **Roxwel**
 - **RTBF**
 - **Rte**
 - **rtl.nl**: rtl.nl and rtlxl.nl
 - **RTL2**
 - **RTLnow**
 - **RTP**
 - **RTS**: RTS.ch
 - **rtve.es:alacarta**: RTVE a la carta
 - **rtve.es:live**: RTVE.es live streams
 - **RUHD**
 - **rutube**: Rutube videos
 - **rutube:channel**: Rutube channels
 - **rutube:embed**: Rutube embedded videos
 - **rutube:movie**: Rutube movies
 - **rutube:person**: Rutube person videos
 - **RUTV**: RUTV.RU
 - **Sandia**: Sandia National Laboratories
 - **Sapo**: SAPO Vídeos
 - **savefrom.net**
 - **SBS**: sbs.com.au
 - **SciVee**
 - **screen.yahoo:search**: Yahoo screen search
 - **Screencast**
 - **ScreencastOMatic**
 - **ScreenwaveMedia**
 - **ServingSys**
 - **Sexu**
 - **SexyKarma**: Sexy Karma and Watch Indian Porn
 - **Shared**
 - **ShareSix**
 - **Sina**
 - **Slideshare**
 - **Slutload**
 - **smotri**: Smotri.com
 - **smotri:broadcast**: Smotri.com broadcasts
 - **smotri:community**: Smotri.com community videos
 - **smotri:user**: Smotri.com user videos
 - **Snotr**
 - **Sockshare**
 - **Sohu**
 - **soundcloud**
 - **soundcloud:playlist**
 - **soundcloud:set**
 - **soundcloud:user**
 - **Soundgasm**
 - **southpark.cc.com**
 - **southpark.de**
 - **Space**
 - **Spankwire**
 - **Spiegel**
 - **Spiegel:Article**: Articles on spiegel.de
 - **Spiegeltv**
 - **Spike**
 - **Sport5**
 - **SportBox**
 - **SportDeutschland**
 - **SRMediathek**: Saarländischer Rundfunk
 - **stanfordoc**: Stanford Open ClassRoom
 - **Steam**
 - **streamcloud.eu**
 - **StreamCZ**
 - **StreetVoice**
 - **SunPorno**
 - **SVTPlay**
 - **SWRMediathek**
 - **Syfy**
 - **SztvHu**
 - **Tagesschau**
 - **Tapely**
 - **Tass**
 - **teachertube**: teachertube.com videos
 - **teachertube:user:collection**: teachertube.com user and collection videos
 - **TeachingChannel**
 - **Teamcoco**
 - **TeamFour**
 - **TechTalks**
 - **techtv.mit.edu**
 - **TED**
 - **tegenlicht.vpro.nl**
 - **TeleBruxelles**
 - **telecinco.es**
 - **TeleMB**
 - **TeleTask**
 - **TenPlay**
 - **TestTube**
 - **TF1**
 - **TheOnion**
 - **ThePlatform**
 - **TheSixtyOne**
 - **ThisAV**
 - **THVideo**
 - **THVideoPlaylist**
 - **tinypic**: tinypic.com videos
 - **tlc.com**
 - **tlc.de**
 - **TMZ**
 - **TNAFlix**
 - **tou.tv**
 - **Toypics**: Toypics user profile
 - **ToypicsUser**: Toypics user profile
 - **TrailerAddict** (Currently broken)
 - **Trilulilu**
 - **TruTube**
 - **Tube8**
 - **Tudou**
 - **Tumblr**
 - **TuneIn**
 - **Turbo**
 - **Tutv**
 - **tv.dfb.de**
 - **TV4**: tv4.se and tv4play.se
 - **tvigle**: Интернет-телевидение Tvigle.ru
 - **tvp.pl**
 - **tvp.pl:Series**
 - **TVPlay**: TV3Play and related services
 - **Tweakers**
 - **twitch:bookmarks**
 - **twitch:chapter**
 - **twitch:past_broadcasts**
 - **twitch:profile**
 - **twitch:stream**
 - **twitch:video**
 - **twitch:vod**
 - **Ubu**
 - **udemy**
 - **udemy:course**
 - **Unistra**
 - **Urort**: NRK P3 Urørt
 - **ustream**
 - **ustream:channel**
 - **Vbox7**
 - **VeeHD**
 - **Veoh**
 - **Vesti**: Вести.Ru
 - **Vevo**
 - **VGTV**
 - **vh1.com**
 - **Vice**
 - **Viddler**
 - **video.google:search**: Google Video search
 - **video.mit.edu**
 - **VideoBam**
 - **VideoDetective**
 - **videofy.me**
 - **videolectures.net**
 - **VideoMega**
 - **VideoPremium**
 - **VideoTt**: video.tt - Your True Tube
 - **videoweed**: VideoWeed
 - **Vidme**
 - **Vidzi**
 - **vier**
 - **vier:videos**
 - **viki**
 - **vimeo**
 - **vimeo:album**
 - **vimeo:channel**
 - **vimeo:group**
 - **vimeo:likes**: Vimeo user likes
 - **vimeo:review**: Review pages on vimeo
 - **vimeo:user**
 - **vimeo:watchlater**: Vimeo watch later list, "vimeowatchlater" keyword (requires authentication)
 - **Vimple**: Vimple.ru
 - **Vine**
 - **vine:user**
 - **vk.com**
 - **vk.com:user-videos**: vk.com:All of a user's videos
 - **Vodlocker**
 - **Vporn**
 - **VRT**
 - **vube**: Vube.com
 - **VuClip**
 - **vulture.com**
 - **Walla**
 - **WashingtonPost**
 - **wat.tv**
 - **WayOfTheMaster**
 - **WDR**
 - **wdr:mobile**
 - **WDRMaus**: Sendung mit der Maus
 - **WebOfStories**
 - **Weibo**
 - **Wimp**
 - **Wistia**
 - **WorldStarHipHop**
 - **wrzuta.pl**
 - **WSJ**: Wall Street Journal
 - **XBef**
 - **XboxClips**
 - **XHamster**
 - **XMinus**
 - **XNXX**
 - **XTube**
 - **XTubeUser**: XTube user profile
 - **Xuite**
 - **XVideos**
 - **XXXYMovies**
 - **Yahoo**: Yahoo screen and movies
 - **Yam**
 - **YesJapan**
 - **Ynet**
 - **YouJizz**
 - **Youku**
 - **YouPorn**
 - **YourUpload**
 - **youtube**: YouTube.com
 - **youtube:channel**: YouTube.com channels
 - **youtube:favorites**: YouTube.com favourite videos, ":ytfav" for short (requires authentication)
 - **youtube:history**: Youtube watch history, ":ythistory" for short (requires authentication)
 - **youtube:playlist**: YouTube.com playlists
 - **youtube:recommended**: YouTube.com recommended videos, ":ytrec" for short (requires authentication)
 - **youtube:search**: YouTube.com searches
 - **youtube:search:date**: YouTube.com searches, newest videos first
 - **youtube:search_url**: YouTube.com search URLs
 - **youtube:show**: YouTube.com (multi-season) shows
 - **youtube:subscriptions**: YouTube.com subscriptions feed, "ytsubs" keyword (requires authentication)
 - **youtube:user**: YouTube.com user videos (URL or "ytuser" keyword)
 - **youtube:watch_later**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
 - **ZDF**
 - **ZDFChannel**
 - **zingmp3:album**: mp3.zing.vn albums
 - **zingmp3:song**: mp3.zing.vn songs
@ -1,2 +1,6 @@
 [wheel]
 universal = True
+
+[flake8]
+exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,setup.py,build,.git
+ignore = E402,E501,E731
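With this section in setup.cfg, a bare flake8 run from the repository root should pick up both the exclude and the ignore lists automatically, since flake8 reads per-project configuration from setup.cfg.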
@ -82,24 +82,14 @@ class FakeYDL(YoutubeDL):
 
 def gettestcases(include_onlymatching=False):
     for ie in youtube_dl.extractor.gen_extractors():
-        t = getattr(ie, '_TEST', None)
-        if t:
-            assert not hasattr(ie, '_TESTS'), \
-                '%s has _TEST and _TESTS' % type(ie).__name__
-            tests = [t]
-        else:
-            tests = getattr(ie, '_TESTS', [])
-        for t in tests:
-            if not include_onlymatching and t.get('only_matching', False):
-                continue
-            t['name'] = type(ie).__name__[:-len('IE')]
-            yield t
+        for tc in ie.get_testcases(include_onlymatching):
+            yield tc
 
 
 md5 = lambda s: hashlib.md5(s.encode('utf-8')).hexdigest()
 
 
-def expect_info_dict(self, expected_dict, got_dict):
+def expect_info_dict(self, got_dict, expected_dict):
     for info_field, expected in expected_dict.items():
         if isinstance(expected, compat_str) and expected.startswith('re:'):
             got = got_dict.get(info_field)
@ -113,6 +103,26 @@ def expect_info_dict(self, expected_dict, got_dict):
             self.assertTrue(
                 match_rex.match(got),
                 'field %s (value: %r) should match %r' % (info_field, got, match_str))
+        elif isinstance(expected, compat_str) and expected.startswith('startswith:'):
+            got = got_dict.get(info_field)
+            start_str = expected[len('startswith:'):]
+            self.assertTrue(
+                isinstance(got, compat_str),
+                'Expected a %s object, but got %s for field %s' % (
+                    compat_str.__name__, type(got).__name__, info_field))
+            self.assertTrue(
+                got.startswith(start_str),
+                'field %s (value: %r) should start with %r' % (info_field, got, start_str))
+        elif isinstance(expected, compat_str) and expected.startswith('contains:'):
+            got = got_dict.get(info_field)
+            contains_str = expected[len('contains:'):]
+            self.assertTrue(
+                isinstance(got, compat_str),
+                'Expected a %s object, but got %s for field %s' % (
+                    compat_str.__name__, type(got).__name__, info_field))
+            self.assertTrue(
+                contains_str in got,
+                'field %s (value: %r) should contain %r' % (info_field, got, contains_str))
         elif isinstance(expected, type):
             got = got_dict.get(info_field)
             self.assertTrue(isinstance(got, expected),
@ -120,6 +130,20 @@ def expect_info_dict(self, expected_dict, got_dict):
         else:
             if isinstance(expected, compat_str) and expected.startswith('md5:'):
                 got = 'md5:' + md5(got_dict.get(info_field))
+            elif isinstance(expected, compat_str) and expected.startswith('mincount:'):
+                got = got_dict.get(info_field)
+                self.assertTrue(
+                    isinstance(got, list),
+                    'Expected field %s to be a list, but it is of type %s' % (
+                        info_field, type(got).__name__))
+                expected_num = int(expected.partition(':')[2])
+                assertGreaterEqual(
+                    self, len(got), expected_num,
+                    'Expected %d items in field %s, but only got %d' % (
+                        expected_num, info_field, len(got)
+                    )
+                )
+                continue
             else:
                 got = got_dict.get(info_field)
             self.assertEqual(expected, got,
@ -137,7 +161,7 @@ def expect_info_dict(self, expected_dict, got_dict):
     # Are checkable fields missing from the test case definition?
     test_info_dict = dict((key, value if not isinstance(value, compat_str) or len(value) < 250 else 'md5:' + md5(value))
                           for key, value in got_dict.items()
-                          if value and key in ('title', 'description', 'uploader', 'upload_date', 'timestamp', 'uploader_id', 'location'))
+                          if value and key in ('id', 'title', 'description', 'uploader', 'upload_date', 'timestamp', 'uploader_id', 'location'))
     missing_keys = set(test_info_dict.keys()) - set(expected_dict.keys())
     if missing_keys:
         def _repr(v):
@ -145,11 +169,19 @@ def expect_info_dict(self, expected_dict, got_dict):
                 return "'%s'" % v.replace('\\', '\\\\').replace("'", "\\'").replace('\n', '\\n')
             else:
                 return repr(v)
-        info_dict_str = ''.join(
-            '    %s: %s,\n' % (_repr(k), _repr(v))
-            for k, v in test_info_dict.items())
+        info_dict_str = ''
+        if len(missing_keys) != len(expected_dict):
+            info_dict_str += ''.join(
+                '    %s: %s,\n' % (_repr(k), _repr(v))
+                for k, v in test_info_dict.items() if k not in missing_keys)
+
+            if info_dict_str:
+                info_dict_str += '\n'
+        info_dict_str += ''.join(
+            '    %s: %s,\n' % (_repr(k), _repr(test_info_dict[k]))
+            for k in missing_keys)
         write_string(
-            '\n\'info_dict\': {\n' + info_dict_str + '}\n', out=sys.stderr)
+            '\n\'info_dict\': {\n' + info_dict_str + '},\n', out=sys.stderr)
         self.assertFalse(
             missing_keys,
             'Missing keys in test definition: %s' % (
@ -162,7 +194,9 @@ def assertRegexpMatches(self, text, regexp, msg=None):
     else:
        m = re.match(regexp, text)
        if not m:
-            note = 'Regexp didn\'t match: %r not found in %r' % (regexp, text)
+            note = 'Regexp didn\'t match: %r not found' % (regexp)
+            if len(text) < 1000:
+                note += ' in %r' % text
            if msg is None:
                msg = note
            else:
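Taken together, expect_info_dict now understands several string-prefix matchers for expected field values. A hypothetical test definition exercising them (every field value here is made up for illustration):

# Hypothetical extractor test using the matchers added above:
_TEST = {
    'url': 'http://example.com/video/123',           # placeholder URL
    'info_dict': {
        'id': '123',
        'ext': 'mp4',
        'title': 're:^Example video [0-9]+$',        # regex match (pre-existing)
        'description': 'startswith:An example',      # prefix match (new above)
        'uploader': 'contains:Example',              # substring match (new above)
        'tags': 'mincount:3',                        # minimum list length (new above)
    },
}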
@ -39,5 +39,6 @@
     "writesubtitles": false,
     "allsubtitles": false,
     "listssubtitles": false,
-    "socket_timeout": 20
+    "socket_timeout": 20,
+    "fixup": "never"
 }
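"fixup": "never" matches the fixup option of YoutubeDL; setting it in the shared test parameters presumably keeps the harness from attempting post-download stream fixups on the small sample files the tests fetch.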
@ -40,5 +40,23 @@ class TestInfoExtractor(unittest.TestCase):
         self.assertEqual(ie._og_search_description(html), 'Some video\'s description ')
         self.assertEqual(ie._og_search_thumbnail(html), 'http://domain.com/pic.jpg?key1=val1&key2=val2')
 
+    def test_html_search_meta(self):
+        ie = self.ie
+        html = '''
+            <meta name="a" content="1" />
+            <meta name='b' content='2'>
+            <meta name="c" content='3'>
+            <meta name=d content='4'>
+            <meta property="e" content='5' >
+            <meta content="6" name="f">
+        '''
+
+        self.assertEqual(ie._html_search_meta('a', html), '1')
+        self.assertEqual(ie._html_search_meta('b', html), '2')
+        self.assertEqual(ie._html_search_meta('c', html), '3')
+        self.assertEqual(ie._html_search_meta('d', html), '4')
+        self.assertEqual(ie._html_search_meta('e', html), '5')
+        self.assertEqual(ie._html_search_meta('f', html), '6')
+
 if __name__ == '__main__':
     unittest.main()
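For reference, a hedged sketch of the helper under test as it would be called from extractor code; the bare instantiation and the DemoIE name are illustrative assumptions, not part of the diff:

# Sketch only: _html_search_meta() returns the content= value of the first
# matching <meta> tag regardless of quoting style or attribute order.
from youtube_dl.extractor.common import InfoExtractor


class DemoIE(InfoExtractor):   # hypothetical extractor
    _VALID_URL = r'demo:.*'


ie = DemoIE()
print(ie._html_search_meta('duration', '<meta name="duration" content="42">'))  # '42'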
@ -8,9 +8,12 @@ import sys
 import unittest
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 
+import copy
+
 from test.helper import FakeYDL, assertRegexpMatches
 from youtube_dl import YoutubeDL
 from youtube_dl.extractor import YoutubeIE
+from youtube_dl.postprocessor.common import PostProcessor
 
 
 class YDL(FakeYDL):
@ -192,6 +195,37 @@ class TestFormatSelection(unittest.TestCase):
         downloaded = ydl.downloaded_info_dicts[0]
         self.assertEqual(downloaded['format_id'], 'vid-high')
 
+    def test_format_selection_audio_exts(self):
+        formats = [
+            {'format_id': 'mp3-64', 'ext': 'mp3', 'abr': 64, 'url': 'http://_', 'vcodec': 'none'},
+            {'format_id': 'ogg-64', 'ext': 'ogg', 'abr': 64, 'url': 'http://_', 'vcodec': 'none'},
+            {'format_id': 'aac-64', 'ext': 'aac', 'abr': 64, 'url': 'http://_', 'vcodec': 'none'},
+            {'format_id': 'mp3-32', 'ext': 'mp3', 'abr': 32, 'url': 'http://_', 'vcodec': 'none'},
+            {'format_id': 'aac-32', 'ext': 'aac', 'abr': 32, 'url': 'http://_', 'vcodec': 'none'},
+        ]
+
+        info_dict = _make_result(formats)
+        ydl = YDL({'format': 'best'})
+        ie = YoutubeIE(ydl)
+        ie._sort_formats(info_dict['formats'])
+        ydl.process_ie_result(copy.deepcopy(info_dict))
+        downloaded = ydl.downloaded_info_dicts[0]
+        self.assertEqual(downloaded['format_id'], 'aac-64')
+
+        ydl = YDL({'format': 'mp3'})
+        ie = YoutubeIE(ydl)
+        ie._sort_formats(info_dict['formats'])
+        ydl.process_ie_result(copy.deepcopy(info_dict))
+        downloaded = ydl.downloaded_info_dicts[0]
+        self.assertEqual(downloaded['format_id'], 'mp3-64')
+
+        ydl = YDL({'prefer_free_formats': True})
+        ie = YoutubeIE(ydl)
+        ie._sort_formats(info_dict['formats'])
+        ydl.process_ie_result(copy.deepcopy(info_dict))
+        downloaded = ydl.downloaded_info_dicts[0]
+        self.assertEqual(downloaded['format_id'], 'ogg-64')
+
     def test_format_selection_video(self):
         formats = [
             {'format_id': 'dash-video-low', 'ext': 'mp4', 'preference': 1, 'acodec': 'none', 'url': '_'},
@ -218,7 +252,7 @@ class TestFormatSelection(unittest.TestCase):
             # 3D
             '85', '84', '102', '83', '101', '82', '100',
             # Dash video
-            '138', '137', '248', '136', '247', '135', '246',
+            '137', '248', '136', '247', '135', '246',
             '245', '244', '134', '243', '133', '242', '160',
             # Dash audio
             '141', '172', '140', '171', '139',
|
|||||||
downloaded = ydl.downloaded_info_dicts[0]
|
downloaded = ydl.downloaded_info_dicts[0]
|
||||||
self.assertEqual(downloaded['format_id'], f1id)
|
self.assertEqual(downloaded['format_id'], f1id)
|
||||||
|
|
||||||
|
def test_format_filtering(self):
|
||||||
|
formats = [
|
||||||
|
{'format_id': 'A', 'filesize': 500, 'width': 1000},
|
||||||
|
{'format_id': 'B', 'filesize': 1000, 'width': 500},
|
||||||
|
{'format_id': 'C', 'filesize': 1000, 'width': 400},
|
||||||
|
{'format_id': 'D', 'filesize': 2000, 'width': 600},
|
||||||
|
{'format_id': 'E', 'filesize': 3000},
|
||||||
|
{'format_id': 'F'},
|
||||||
|
{'format_id': 'G', 'filesize': 1000000},
|
||||||
|
]
|
||||||
|
for f in formats:
|
||||||
|
f['url'] = 'http://_/'
|
||||||
|
f['ext'] = 'unknown'
|
||||||
|
info_dict = _make_result(formats)
|
||||||
|
|
||||||
|
ydl = YDL({'format': 'best[filesize<3000]'})
|
||||||
|
ydl.process_ie_result(info_dict)
|
||||||
|
downloaded = ydl.downloaded_info_dicts[0]
|
||||||
|
self.assertEqual(downloaded['format_id'], 'D')
|
||||||
|
|
||||||
|
ydl = YDL({'format': 'best[filesize<=3000]'})
|
||||||
|
ydl.process_ie_result(info_dict)
|
||||||
|
downloaded = ydl.downloaded_info_dicts[0]
|
||||||
|
self.assertEqual(downloaded['format_id'], 'E')
|
||||||
|
|
||||||
|
ydl = YDL({'format': 'best[filesize <= ? 3000]'})
|
||||||
|
ydl.process_ie_result(info_dict)
|
||||||
|
downloaded = ydl.downloaded_info_dicts[0]
|
||||||
|
self.assertEqual(downloaded['format_id'], 'F')
|
||||||
|
|
||||||
|
ydl = YDL({'format': 'best [filesize = 1000] [width>450]'})
|
||||||
|
ydl.process_ie_result(info_dict)
|
||||||
|
downloaded = ydl.downloaded_info_dicts[0]
|
||||||
|
self.assertEqual(downloaded['format_id'], 'B')
|
||||||
|
|
||||||
|
ydl = YDL({'format': 'best [filesize = 1000] [width!=450]'})
|
||||||
|
ydl.process_ie_result(info_dict)
|
||||||
|
downloaded = ydl.downloaded_info_dicts[0]
|
||||||
|
self.assertEqual(downloaded['format_id'], 'C')
|
||||||
|
|
||||||
|
ydl = YDL({'format': '[filesize>?1]'})
|
||||||
|
ydl.process_ie_result(info_dict)
|
||||||
|
downloaded = ydl.downloaded_info_dicts[0]
|
||||||
|
self.assertEqual(downloaded['format_id'], 'G')
|
||||||
|
|
||||||
|
ydl = YDL({'format': '[filesize<1M]'})
|
||||||
|
ydl.process_ie_result(info_dict)
|
||||||
|
downloaded = ydl.downloaded_info_dicts[0]
|
||||||
|
self.assertEqual(downloaded['format_id'], 'E')
|
||||||
|
|
||||||
|
ydl = YDL({'format': '[filesize<1MiB]'})
|
||||||
|
ydl.process_ie_result(info_dict)
|
||||||
|
downloaded = ydl.downloaded_info_dicts[0]
|
||||||
|
self.assertEqual(downloaded['format_id'], 'G')
|
||||||
|
|
||||||
def test_add_extra_info(self):
|
def test_add_extra_info(self):
|
||||||
test_dict = {
|
test_dict = {
|
||||||
'extractor': 'Foo',
|
'extractor': 'Foo',
|
||||||
@ -282,5 +371,35 @@ class TestFormatSelection(unittest.TestCase):
|
|||||||
'vbr': 10,
|
'vbr': 10,
|
||||||
}), '^\s*10k$')
|
}), '^\s*10k$')
|
||||||
|
|
||||||
|
def test_postprocessors(self):
|
||||||
|
filename = 'post-processor-testfile.mp4'
|
||||||
|
audiofile = filename + '.mp3'
|
||||||
|
|
||||||
|
class SimplePP(PostProcessor):
|
||||||
|
def run(self, info):
|
||||||
|
with open(audiofile, 'wt') as f:
|
||||||
|
f.write('EXAMPLE')
|
||||||
|
info['filepath']
|
||||||
|
return False, info
|
||||||
|
|
||||||
|
def run_pp(params):
|
||||||
|
with open(filename, 'wt') as f:
|
||||||
|
f.write('EXAMPLE')
|
||||||
|
ydl = YoutubeDL(params)
|
||||||
|
ydl.add_post_processor(SimplePP())
|
||||||
|
ydl.post_process(filename, {'filepath': filename})
|
||||||
|
|
||||||
|
run_pp({'keepvideo': True})
|
||||||
|
self.assertTrue(os.path.exists(filename), '%s doesn\'t exist' % filename)
|
||||||
|
self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
|
||||||
|
os.unlink(filename)
|
||||||
|
os.unlink(audiofile)
|
||||||
|
|
||||||
|
run_pp({'keepvideo': False})
|
||||||
|
self.assertFalse(os.path.exists(filename), '%s exists' % filename)
|
||||||
|
self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
|
||||||
|
os.unlink(audiofile)
|
||||||
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
if __name__ == '__main__':
|
||||||
unittest.main()
|
unittest.main()
|
||||||
|
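A sketch of the post-processor contract that test_postprocessors relies on at this revision; treat the exact semantics as an assumption beyond what the test asserts: run() returns a (keep_video, info) pair, and the downloaded file survives if either the post-processor or the keepvideo option asks for it.

from youtube_dl.postprocessor.common import PostProcessor


class NoOpPP(PostProcessor):   # hypothetical post-processor, not from the diff
    def run(self, information):
        # keep_video=True: never request deletion of the downloaded file
        return True, information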
@ -45,11 +45,6 @@ class TestAgeRestriction(unittest.TestCase):
             'http://www.youporn.com/watch/505835/sex-ed-is-it-safe-to-masturbate-daily/',
             '505835.mp4', 2, old_age=25)
 
-    def test_pornotube(self):
-        self._assert_restricted(
-            'http://pornotube.com/c/173/m/1689755/Marilyn-Monroe-Bathing',
-            '1689755.flv', 13)
-
 
 if __name__ == '__main__':
     unittest.main()
@ -14,7 +14,6 @@ from test.helper import gettestcases
 from youtube_dl.extractor import (
     FacebookIE,
     gen_extractors,
-    TwitchIE,
     YoutubeIE,
 )
 
@ -72,18 +71,6 @@ class TestAllURLsMatching(unittest.TestCase):
         self.assertMatch('http://www.youtube.com/results?search_query=making+mustard', ['youtube:search_url'])
         self.assertMatch('https://www.youtube.com/results?baz=bar&search_query=youtube-dl+test+video&filters=video&lclk=video', ['youtube:search_url'])
 
-    def test_twitch_channelid_matching(self):
-        self.assertTrue(TwitchIE.suitable('twitch.tv/vanillatv'))
-        self.assertTrue(TwitchIE.suitable('www.twitch.tv/vanillatv'))
-        self.assertTrue(TwitchIE.suitable('http://www.twitch.tv/vanillatv'))
-        self.assertTrue(TwitchIE.suitable('http://www.twitch.tv/vanillatv/'))
-
-    def test_twitch_videoid_matching(self):
-        self.assertTrue(TwitchIE.suitable('http://www.twitch.tv/vanillatv/b/328087483'))
-
-    def test_twitch_chapterid_matching(self):
-        self.assertTrue(TwitchIE.suitable('http://www.twitch.tv/tsm_theoddone/c/2349361'))
-
     def test_youtube_extract(self):
         assertExtractId = lambda url, id: self.assertEqual(YoutubeIE.extract_id(url), id)
         assertExtractId('http://www.youtube.com/watch?&v=BaW_jenozKc', 'BaW_jenozKc')
@ -115,8 +102,6 @@ class TestAllURLsMatching(unittest.TestCase):
         self.assertMatch(':ythistory', ['youtube:history'])
         self.assertMatch(':thedailyshow', ['ComedyCentralShows'])
         self.assertMatch(':tds', ['ComedyCentralShows'])
-        self.assertMatch(':colbertreport', ['ComedyCentralShows'])
-        self.assertMatch(':cr', ['ComedyCentralShows'])
 
     def test_vimeo_matching(self):
         self.assertMatch('http://vimeo.com/channels/tributes', ['vimeo:channel'])
@ -89,7 +89,7 @@ def generator(test_case):
 
     for tc in test_cases:
         info_dict = tc.get('info_dict', {})
-        if not tc.get('file') and not (info_dict.get('id') and info_dict.get('ext')):
+        if not (info_dict.get('id') and info_dict.get('ext')):
             raise Exception('Test definition incorrect. The output file cannot be known. Are both \'id\' and \'ext\' keys present?')
 
     if 'skip' in test_case:
@ -116,7 +116,7 @@ def generator(test_case):
     expect_warnings(ydl, test_case.get('expected_warnings', []))
 
     def get_tc_filename(tc):
-        return tc.get('file') or ydl.prepare_filename(tc.get('info_dict', {}))
+        return ydl.prepare_filename(tc.get('info_dict', {}))
 
     res_dict = None
 
@ -155,7 +155,7 @@ def generator(test_case):
             if is_playlist:
                 self.assertEqual(res_dict['_type'], 'playlist')
                 self.assertTrue('entries' in res_dict)
-                expect_info_dict(self, test_case.get('info_dict', {}), res_dict)
+                expect_info_dict(self, res_dict, test_case.get('info_dict', {}))
 
             if 'playlist_mincount' in test_case:
                 assertGreaterEqual(
@ -204,7 +204,7 @@ def generator(test_case):
                 with io.open(info_json_fn, encoding='utf-8') as infof:
                     info_dict = json.load(infof)
 
-                expect_info_dict(self, tc.get('info_dict', {}), info_dict)
+                expect_info_dict(self, info_dict, tc.get('info_dict', {}))
             finally:
                 try_rm_tcs_files()
             if is_playlist and res_dict is not None and res_dict.get('entries'):
72
test/test_http.py
Normal file
@ -0,0 +1,72 @@
#!/usr/bin/env python
from __future__ import unicode_literals

# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from youtube_dl import YoutubeDL
from youtube_dl.compat import compat_http_server
import ssl
import threading

TEST_DIR = os.path.dirname(os.path.abspath(__file__))


class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
    def log_message(self, format, *args):
        pass

    def do_GET(self):
        if self.path == '/video.html':
            self.send_response(200)
            self.send_header('Content-Type', 'text/html; charset=utf-8')
            self.end_headers()
            self.wfile.write(b'<html><video src="/vid.mp4" /></html>')
        elif self.path == '/vid.mp4':
            self.send_response(200)
            self.send_header('Content-Type', 'video/mp4')
            self.end_headers()
            self.wfile.write(b'\x00\x00\x00\x00\x20\x66\x74[video]')
        else:
            assert False


class FakeLogger(object):
    def debug(self, msg):
        pass

    def warning(self, msg):
        pass

    def error(self, msg):
        pass


class TestHTTP(unittest.TestCase):
    def setUp(self):
        certfn = os.path.join(TEST_DIR, 'testcert.pem')
        self.httpd = compat_http_server.HTTPServer(
            ('localhost', 0), HTTPTestRequestHandler)
        self.httpd.socket = ssl.wrap_socket(
            self.httpd.socket, certfile=certfn, server_side=True)
        self.port = self.httpd.socket.getsockname()[1]
        self.server_thread = threading.Thread(target=self.httpd.serve_forever)
        self.server_thread.daemon = True
        self.server_thread.start()

    def test_nocheckcertificate(self):
        if sys.version_info >= (2, 7, 9):  # No certificate checking anyways
            ydl = YoutubeDL({'logger': FakeLogger()})
            self.assertRaises(
                Exception,
                ydl.extract_info, 'https://localhost:%d/video.html' % self.port)

        ydl = YoutubeDL({'logger': FakeLogger(), 'nocheckcertificate': True})
        r = ydl.extract_info('https://localhost:%d/video.html' % self.port)
        self.assertEqual(r['url'], 'https://localhost:%d/vid.mp4' % self.port)

if __name__ == '__main__':
    unittest.main()
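What the test pins down, in API form; the URL is a placeholder, and note that on Python versions before 2.7.9 the stdlib did not verify certificates by default, which is why the negative half of the test is gated:

from youtube_dl import YoutubeDL

ydl = YoutubeDL({'nocheckcertificate': True})   # skip TLS certificate checks
# ydl.extract_info('https://self-signed.example/video.html')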
106
test/test_jsinterp.py
Normal file
@ -0,0 +1,106 @@
#!/usr/bin/env python

from __future__ import unicode_literals

# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from youtube_dl.jsinterp import JSInterpreter


class TestJSInterpreter(unittest.TestCase):
    def test_basic(self):
        jsi = JSInterpreter('function x(){;}')
        self.assertEqual(jsi.call_function('x'), None)

        jsi = JSInterpreter('function x3(){return 42;}')
        self.assertEqual(jsi.call_function('x3'), 42)

    def test_calc(self):
        jsi = JSInterpreter('function x4(a){return 2*a+1;}')
        self.assertEqual(jsi.call_function('x4', 3), 7)

    def test_empty_return(self):
        jsi = JSInterpreter('function f(){return; y()}')
        self.assertEqual(jsi.call_function('f'), None)

    def test_morespace(self):
        jsi = JSInterpreter('function x (a) { return 2 * a + 1 ; }')
        self.assertEqual(jsi.call_function('x', 3), 7)

        jsi = JSInterpreter('function f () { x = 2 ; return x; }')
        self.assertEqual(jsi.call_function('f'), 2)

    def test_strange_chars(self):
        jsi = JSInterpreter('function $_xY1 ($_axY1) { var $_axY2 = $_axY1 + 1; return $_axY2; }')
        self.assertEqual(jsi.call_function('$_xY1', 20), 21)

    def test_operators(self):
        jsi = JSInterpreter('function f(){return 1 << 5;}')
        self.assertEqual(jsi.call_function('f'), 32)

        jsi = JSInterpreter('function f(){return 19 & 21;}')
        self.assertEqual(jsi.call_function('f'), 17)

        jsi = JSInterpreter('function f(){return 11 >> 2;}')
        self.assertEqual(jsi.call_function('f'), 2)

    def test_array_access(self):
        jsi = JSInterpreter('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2] = 7; return x;}')
        self.assertEqual(jsi.call_function('f'), [5, 2, 7])

    def test_parens(self):
        jsi = JSInterpreter('function f(){return (1) + (2) * ((( (( (((((3)))))) )) ));}')
        self.assertEqual(jsi.call_function('f'), 7)

        jsi = JSInterpreter('function f(){return (1 + 2) * 3;}')
        self.assertEqual(jsi.call_function('f'), 9)

    def test_assignments(self):
        jsi = JSInterpreter('function f(){var x = 20; x = 30 + 1; return x;}')
        self.assertEqual(jsi.call_function('f'), 31)

        jsi = JSInterpreter('function f(){var x = 20; x += 30 + 1; return x;}')
        self.assertEqual(jsi.call_function('f'), 51)

        jsi = JSInterpreter('function f(){var x = 20; x -= 30 + 1; return x;}')
        self.assertEqual(jsi.call_function('f'), -11)

    def test_comments(self):
        'Skipping: Not yet fully implemented'
        return
        jsi = JSInterpreter('''
        function x() {
            var x = /* 1 + */ 2;
            var y = /* 30
            * 40 */ 50;
            return x + y;
        }
        ''')
        self.assertEqual(jsi.call_function('x'), 52)

        jsi = JSInterpreter('''
        function f() {
            var x = "/*";
            var y = 1 /* comment */ + 2;
            return y;
        }
        ''')
        self.assertEqual(jsi.call_function('f'), 3)

    def test_precedence(self):
        jsi = JSInterpreter('''
        function x() {
            var a = [10, 20, 30, 40, 50];
            var b = 6;
            a[0]=a[b%a.length];
            return a;
        }''')
        self.assertEqual(jsi.call_function('x'), [20, 20, 30, 40, 50])


if __name__ == '__main__':
    unittest.main()
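The interpreter's public surface, as the tests use it, is just the constructor plus call_function; a minimal sketch with made-up JavaScript:

from youtube_dl.jsinterp import JSInterpreter

jsi = JSInterpreter('function double(a){return 2*a;}')
assert jsi.call_function('double', 21) == 42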
@ -17,6 +17,7 @@ from youtube_dl.extractor import (
     TEDIE,
     VimeoIE,
     WallaIE,
+    CeskaTelevizeIE,
 )
 
 
@ -88,6 +89,14 @@ class TestYoutubeSubtitles(BaseTestSubtitles):
         subtitles = self.getSubtitles()
         self.assertTrue(subtitles['it'] is not None)
 
+    def test_youtube_translated_subtitles(self):
+        # This video has a subtitles track, which can be translated
+        self.url = 'Ky9eprVWzlI'
+        self.DL.params['writeautomaticsub'] = True
+        self.DL.params['subtitleslangs'] = ['it']
+        subtitles = self.getSubtitles()
+        self.assertTrue(subtitles['it'] is not None)
+
     def test_youtube_nosubtitles(self):
         self.DL.expect_warning('video doesn\'t have subtitles')
         self.url = 'n5BB19UTcdA'
@ -129,7 +138,7 @@ class TestDailymotionSubtitles(BaseTestSubtitles):
         self.DL.params['writesubtitles'] = True
         self.DL.params['allsubtitles'] = True
         subtitles = self.getSubtitles()
-        self.assertEqual(len(subtitles.keys()), 5)
+        self.assertTrue(len(subtitles.keys()) >= 6)
 
     def test_list_subtitles(self):
         self.DL.expect_warning('Automatic Captions not supported by this server')
@ -309,5 +318,32 @@ class TestWallaSubtitles(BaseTestSubtitles):
         self.assertEqual(len(subtitles), 0)
 
 
+class TestCeskaTelevizeSubtitles(BaseTestSubtitles):
+    url = 'http://www.ceskatelevize.cz/ivysilani/10600540290-u6-uzasny-svet-techniky'
+    IE = CeskaTelevizeIE
+
+    def test_list_subtitles(self):
+        self.DL.expect_warning('Automatic Captions not supported by this server')
+        self.DL.params['listsubtitles'] = True
+        info_dict = self.getInfoDict()
+        self.assertEqual(info_dict, None)
+
+    def test_allsubtitles(self):
+        self.DL.expect_warning('Automatic Captions not supported by this server')
+        self.DL.params['writesubtitles'] = True
+        self.DL.params['allsubtitles'] = True
+        subtitles = self.getSubtitles()
+        self.assertEqual(set(subtitles.keys()), set(['cs']))
+        self.assertTrue(len(subtitles['cs']) > 20000)
+
+    def test_nosubtitles(self):
+        self.DL.expect_warning('video doesn\'t have subtitles')
+        self.url = 'http://www.ceskatelevize.cz/ivysilani/ivysilani/10441294653-hyde-park-civilizace/214411058091220'
+        self.DL.params['writesubtitles'] = True
+        self.DL.params['allsubtitles'] = True
+        subtitles = self.getSubtitles()
+        self.assertEqual(len(subtitles), 0)
+
+
 if __name__ == '__main__':
     unittest.main()
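The parameters these subtitle tests keep toggling are ordinary YoutubeDL options; a sketch with the same values the tests use:

from youtube_dl import YoutubeDL

ydl = YoutubeDL({
    'writesubtitles': True,      # request subtitle tracks
    'allsubtitles': True,        # all languages rather than a selection
    'subtitleslangs': ['it'],    # as in the translated-subtitles test
    'writeautomaticsub': True,   # include automatic captions
})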
@ -1,9 +1,13 @@
 from __future__ import unicode_literals
 
-import io
+# Allow direct execution
 import os
-import re
+import sys
 import unittest
+sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+
+import io
+import re
 
 rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
 
@ -14,6 +18,9 @@ IGNORED_FILES = [
 ]
 
 
+from test.helper import assertRegexpMatches
+
+
 class TestUnicodeLiterals(unittest.TestCase):
     def test_all_files(self):
         for dirpath, _, filenames in os.walk(rootDir):
@ -29,9 +36,10 @@ class TestUnicodeLiterals(unittest.TestCase):
 
             if "'" not in code and '"' not in code:
                 continue
-            self.assertRegexpMatches(
+            assertRegexpMatches(
+                self,
                 code,
-                r'(?:#.*\n*)?from __future__ import (?:[a-z_]+,\s*)*unicode_literals',
+                r'(?:(?:#.*?|\s*)\n)*from __future__ import (?:[a-z_]+,\s*)*unicode_literals',
                 'unicode_literals import missing in %s' % fn)
 
             m = re.search(r'(?<=\s)u[\'"](?!\)|,|$)', code)
@ -16,38 +16,44 @@ import json
|
|||||||
import xml.etree.ElementTree
|
import xml.etree.ElementTree
|
||||||
|
|
||||||
from youtube_dl.utils import (
|
from youtube_dl.utils import (
|
||||||
|
age_restricted,
|
||||||
|
args_to_str,
|
||||||
clean_html,
|
clean_html,
|
||||||
DateRange,
|
DateRange,
|
||||||
|
detect_exe_version,
|
||||||
encodeFilename,
|
encodeFilename,
|
||||||
|
escape_rfc3986,
|
||||||
|
escape_url,
|
||||||
find_xpath_attr,
|
find_xpath_attr,
|
||||||
fix_xml_ampersands,
|
fix_xml_ampersands,
|
||||||
orderedSet,
|
|
||||||
OnDemandPagedList,
|
|
||||||
InAdvancePagedList,
|
InAdvancePagedList,
|
||||||
|
intlist_to_bytes,
|
||||||
|
is_html,
|
||||||
|
js_to_json,
|
||||||
|
limit_length,
|
||||||
|
OnDemandPagedList,
|
||||||
|
orderedSet,
|
||||||
parse_duration,
|
parse_duration,
|
||||||
|
parse_filesize,
|
||||||
|
parse_iso8601,
|
||||||
read_batch_urls,
|
read_batch_urls,
|
||||||
sanitize_filename,
|
sanitize_filename,
|
||||||
shell_quote,
|
shell_quote,
|
||||||
smuggle_url,
|
smuggle_url,
|
||||||
str_to_int,
|
str_to_int,
|
||||||
|
strip_jsonp,
|
||||||
struct_unpack,
|
struct_unpack,
|
||||||
timeconvert,
|
timeconvert,
|
||||||
unescapeHTML,
|
unescapeHTML,
|
||||||
unified_strdate,
|
unified_strdate,
|
||||||
unsmuggle_url,
|
unsmuggle_url,
|
||||||
|
uppercase_escape,
|
||||||
url_basename,
|
url_basename,
|
||||||
urlencode_postdata,
|
urlencode_postdata,
|
||||||
|
version_tuple,
|
||||||
xpath_with_ns,
|
xpath_with_ns,
|
||||||
parse_iso8601,
|
render_table,
|
||||||
strip_jsonp,
|
match_str,
|
||||||
uppercase_escape,
|
|
||||||
limit_length,
|
|
||||||
escape_rfc3986,
|
|
||||||
escape_url,
|
|
||||||
js_to_json,
|
|
||||||
intlist_to_bytes,
|
|
||||||
args_to_str,
|
|
||||||
parse_filesize,
|
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@ -76,6 +82,10 @@ class TestUtil(unittest.TestCase):
|
|||||||
tests = '\u043a\u0438\u0440\u0438\u043b\u043b\u0438\u0446\u0430'
|
tests = '\u043a\u0438\u0440\u0438\u043b\u043b\u0438\u0446\u0430'
|
||||||
self.assertEqual(sanitize_filename(tests), tests)
|
self.assertEqual(sanitize_filename(tests), tests)
|
||||||
|
|
||||||
|
self.assertEqual(
|
||||||
|
sanitize_filename('New World record at 0:12:34'),
|
||||||
|
'New World record at 0_12_34')
|
||||||
|
|
||||||
forbidden = '"\0\\/'
|
forbidden = '"\0\\/'
|
||||||
for fc in forbidden:
|
for fc in forbidden:
|
||||||
for fbc in forbidden:
|
for fbc in forbidden:
|
||||||
@ -141,8 +151,15 @@ class TestUtil(unittest.TestCase):
|
|||||||
self.assertEqual(unified_strdate('8/7/2009'), '20090708')
|
self.assertEqual(unified_strdate('8/7/2009'), '20090708')
|
||||||
self.assertEqual(unified_strdate('Dec 14, 2012'), '20121214')
|
self.assertEqual(unified_strdate('Dec 14, 2012'), '20121214')
|
||||||
self.assertEqual(unified_strdate('2012/10/11 01:56:38 +0000'), '20121011')
|
self.assertEqual(unified_strdate('2012/10/11 01:56:38 +0000'), '20121011')
|
||||||
|
self.assertEqual(unified_strdate('1968 12 10'), '19681210')
|
||||||
self.assertEqual(unified_strdate('1968-12-10'), '19681210')
|
self.assertEqual(unified_strdate('1968-12-10'), '19681210')
|
||||||
self.assertEqual(unified_strdate('28/01/2014 21:00:00 +0100'), '20140128')
|
self.assertEqual(unified_strdate('28/01/2014 21:00:00 +0100'), '20140128')
|
||||||
|
self.assertEqual(
|
||||||
|
unified_strdate('11/26/2014 11:30:00 AM PST', day_first=False),
|
||||||
|
'20141126')
|
||||||
|
self.assertEqual(
|
||||||
|
unified_strdate('2/2/2015 6:47:40 PM', day_first=False),
|
||||||
|
'20150202')
|
||||||
|
|
||||||
def test_find_xpath_attr(self):
|
def test_find_xpath_attr(self):
|
||||||
testxml = '''<root>
|
testxml = '''<root>
|
||||||
@@ -202,6 +219,8 @@ class TestUtil(unittest.TestCase):
 
     def test_parse_duration(self):
         self.assertEqual(parse_duration(None), None)
+        self.assertEqual(parse_duration(False), None)
+        self.assertEqual(parse_duration('invalid'), None)
         self.assertEqual(parse_duration('1'), 1)
         self.assertEqual(parse_duration('1337:12'), 80232)
         self.assertEqual(parse_duration('9:12:43'), 33163)
@@ -220,6 +239,11 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(parse_duration('0s'), 0)
         self.assertEqual(parse_duration('01:02:03.05'), 3723.05)
         self.assertEqual(parse_duration('T30M38S'), 1838)
+        self.assertEqual(parse_duration('5 s'), 5)
+        self.assertEqual(parse_duration('3 min'), 180)
+        self.assertEqual(parse_duration('2.5 hours'), 9000)
+        self.assertEqual(parse_duration('02:03:04'), 7384)
+        self.assertEqual(parse_duration('01:02:03:04'), 93784)
 
     def test_fix_xml_ampersands(self):
         self.assertEqual(
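
The new cases pin down several accepted duration spellings: bare seconds, colon-separated clock times up to days, and unit-suffixed forms. As a rough, hypothetical sketch of the parsing approach (this is not youtube-dl's actual parse_duration, which additionally handles ISO-8601-style strings such as 'T30M38S'; Python 3 assumed):

import re


def parse_duration_sketch(s):
    # Return a duration in seconds, or None if s is not a duration.
    if not isinstance(s, str):
        return None
    # Clock times: [[[days:]hours:]minutes:]seconds[.fraction]
    m = re.match(r'^(?:(?:(?:(\d+):)?(\d+):)?(\d+):)?(\d+(?:\.\d+)?)$', s)
    if m:
        d, h, mi, sec = (float(g) if g else 0 for g in m.groups())
        return ((d * 24 + h) * 60 + mi) * 60 + sec
    # Unit-suffixed forms: '5 s', '3 min', '2.5 hours'
    m = re.match(r'^(\d+(?:\.\d+)?)\s*(s|sec|min|hours?)$', s)
    if m:
        mult = {'s': 1, 'sec': 1, 'min': 60, 'hour': 3600, 'hours': 3600}
        return float(m.group(1)) * mult[m.group(2)]
    return None


assert parse_duration_sketch('01:02:03:04') == 93784  # same value the test expects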
@@ -346,6 +370,10 @@ class TestUtil(unittest.TestCase):
             "playlist":[{"controls":{"all":null}}]
         }''')
 
+        inp = '"SAND Number: SAND 2013-7800P\\nPresenter: Tom Russo\\nHabanero Software Training - Xyce Software\\nXyce, Sandia\\u0027s"'
+        json_code = js_to_json(inp)
+        self.assertEqual(json.loads(json_code), json.loads(inp))
+
     def test_js_to_json_edgecases(self):
         on = js_to_json("{abc_def:'1\\'\\\\2\\\\\\'3\"4'}")
         self.assertEqual(json.loads(on), {"abc_def": "1'\\2\\'3\"4"})
@@ -353,6 +381,16 @@ class TestUtil(unittest.TestCase):
         on = js_to_json('{"abc": true}')
         self.assertEqual(json.loads(on), {'abc': True})
 
+        # Ignore JavaScript code as well
+        on = js_to_json('''{
+            "x": 1,
+            y: "a",
+            z: some.code
+        }''')
+        d = json.loads(on)
+        self.assertEqual(d['x'], 1)
+        self.assertEqual(d['y'], 'a')
+
     def test_clean_html(self):
         self.assertEqual(clean_html('a:\nb'), 'a: b')
         self.assertEqual(clean_html('a:\n "b"'), 'a: "b"')
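
As these tests document, js_to_json massages relaxed JavaScript object literals into strict JSON: quoting bare keys, rewriting single-quoted strings, and tolerating embedded code values. A deliberately naive sketch of the core idea (the real implementation also handles escape sequences and values like some.code):

import json
import re


def js_to_json_sketch(code):
    # Quote bare identifiers used as keys: {abc: 1} -> {"abc": 1}
    code = re.sub(r'([{,]\s*)([A-Za-z_][A-Za-z0-9_]*)\s*:', r'\1"\2":', code)
    # Rewrite single-quoted strings (naive: ignores escape sequences)
    code = re.sub(r"'([^'\\]*)'", r'"\1"', code)
    return code


assert json.loads(js_to_json_sketch("{abc: 'def'}")) == {'abc': 'def'}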
@@ -376,6 +414,87 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(parse_filesize('2 MiB'), 2097152)
         self.assertEqual(parse_filesize('5 GB'), 5000000000)
         self.assertEqual(parse_filesize('1.2Tb'), 1200000000000)
+        self.assertEqual(parse_filesize('1,24 KB'), 1240)
+
+    def test_version_tuple(self):
+        self.assertEqual(version_tuple('1'), (1,))
+        self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))
+        self.assertEqual(version_tuple('10.1-6'), (10, 1, 6))  # avconv style
+
+    def test_detect_exe_version(self):
+        self.assertEqual(detect_exe_version('''ffmpeg version 1.2.1
+built on May 27 2013 08:37:26 with gcc 4.7 (Debian 4.7.3-4)
+configuration: --prefix=/usr --extra-'''), '1.2.1')
+        self.assertEqual(detect_exe_version('''ffmpeg version N-63176-g1fb4685
+built on May 15 2014 22:09:06 with gcc 4.8.2 (GCC)'''), 'N-63176-g1fb4685')
+        self.assertEqual(detect_exe_version('''X server found. dri2 connection failed!
+Trying to open render node...
+Success at /dev/dri/renderD128.
+ffmpeg version 2.4.4 Copyright (c) 2000-2014 the FFmpeg ...'''), '2.4.4')
+
+    def test_age_restricted(self):
+        self.assertFalse(age_restricted(None, 10))  # unrestricted content
+        self.assertFalse(age_restricted(1, None))  # unrestricted policy
+        self.assertFalse(age_restricted(8, 10))
+        self.assertTrue(age_restricted(18, 14))
+        self.assertFalse(age_restricted(18, 18))
+
+    def test_is_html(self):
+        self.assertFalse(is_html(b'\x49\x44\x43<html'))
+        self.assertTrue(is_html(b'<!DOCTYPE foo>\xaaa'))
+        self.assertTrue(is_html(  # UTF-8 with BOM
+            b'\xef\xbb\xbf<!DOCTYPE foo>\xaaa'))
+        self.assertTrue(is_html(  # UTF-16-LE
+            b'\xff\xfe<\x00h\x00t\x00m\x00l\x00>\x00\xe4\x00'
+        ))
+        self.assertTrue(is_html(  # UTF-16-BE
+            b'\xfe\xff\x00<\x00h\x00t\x00m\x00l\x00>\x00\xe4'
+        ))
+        self.assertTrue(is_html(  # UTF-32-BE
+            b'\x00\x00\xFE\xFF\x00\x00\x00<\x00\x00\x00h\x00\x00\x00t\x00\x00\x00m\x00\x00\x00l\x00\x00\x00>\x00\x00\x00\xe4'))
+        self.assertTrue(is_html(  # UTF-32-LE
+            b'\xFF\xFE\x00\x00<\x00\x00\x00h\x00\x00\x00t\x00\x00\x00m\x00\x00\x00l\x00\x00\x00>\x00\x00\x00\xe4\x00\x00\x00'))
+
+    def test_render_table(self):
+        self.assertEqual(
+            render_table(
+                ['a', 'bcd'],
+                [[123, 4], [9999, 51]]),
+            'a    bcd\n'
+            '123  4\n'
+            '9999 51')
+
+    def test_match_str(self):
+        self.assertRaises(ValueError, match_str, 'xy>foobar', {})
+        self.assertFalse(match_str('xy', {'x': 1200}))
+        self.assertTrue(match_str('!xy', {'x': 1200}))
+        self.assertTrue(match_str('x', {'x': 1200}))
+        self.assertFalse(match_str('!x', {'x': 1200}))
+        self.assertTrue(match_str('x', {'x': 0}))
+        self.assertFalse(match_str('x>0', {'x': 0}))
+        self.assertFalse(match_str('x>0', {}))
+        self.assertTrue(match_str('x>?0', {}))
+        self.assertTrue(match_str('x>1K', {'x': 1200}))
+        self.assertFalse(match_str('x>2K', {'x': 1200}))
+        self.assertTrue(match_str('x>=1200 & x < 1300', {'x': 1200}))
+        self.assertFalse(match_str('x>=1100 & x < 1200', {'x': 1200}))
+        self.assertFalse(match_str('y=a212', {'y': 'foobar42'}))
+        self.assertTrue(match_str('y=foobar42', {'y': 'foobar42'}))
+        self.assertFalse(match_str('y!=foobar42', {'y': 'foobar42'}))
+        self.assertTrue(match_str('y!=foobar2', {'y': 'foobar42'}))
+        self.assertFalse(match_str(
+            'like_count > 100 & dislike_count <? 50 & description',
+            {'like_count': 90, 'description': 'foo'}))
+        self.assertTrue(match_str(
+            'like_count > 100 & dislike_count <? 50 & description',
+            {'like_count': 190, 'description': 'foo'}))
+        self.assertFalse(match_str(
+            'like_count > 100 & dislike_count <? 50 & description',
+            {'like_count': 190, 'dislike_count': 60, 'description': 'foo'}))
+        self.assertFalse(match_str(
+            'like_count > 100 & dislike_count <? 50 & description',
+            {'like_count': 190, 'dislike_count': 10}))
+
+
 if __name__ == '__main__':
     unittest.main()
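
test_is_html above is mostly about byte-order marks: the sniffer must decode a BOM-tagged prefix before looking for markup. The technique, in an assumed simplified form (not the real is_html):

import re

# (BOM, codec) pairs; the 4-byte UTF-32 BOMs must be tried before their
# 2-byte UTF-16 prefixes, or UTF-32-LE input would be misdetected
BOMS = [
    (b'\x00\x00\xfe\xff', 'utf-32-be'),
    (b'\xff\xfe\x00\x00', 'utf-32-le'),
    (b'\xff\xfe', 'utf-16-le'),
    (b'\xfe\xff', 'utf-16-be'),
    (b'\xef\xbb\xbf', 'utf-8'),
]


def looks_like_html(first_bytes):
    encoding = 'ascii'
    for bom, enc in BOMS:
        if first_bytes.startswith(bom):
            first_bytes = first_bytes[len(bom):]
            encoding = enc
            break
    text = first_bytes.decode(encoding, 'replace')
    return re.match(r'\s*<(?:!doctype|html)', text, re.IGNORECASE) is not None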
@@ -1,76 +0,0 @@
-#!/usr/bin/env python
-# coding: utf-8
-from __future__ import unicode_literals
-
-# Allow direct execution
-import os
-import sys
-import unittest
-sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
-
-from test.helper import get_params
-
-
-import io
-import json
-
-import youtube_dl.YoutubeDL
-import youtube_dl.extractor
-
-
-class YoutubeDL(youtube_dl.YoutubeDL):
-    def __init__(self, *args, **kwargs):
-        super(YoutubeDL, self).__init__(*args, **kwargs)
-        self.to_stderr = self.to_screen
-
-params = get_params({
-    'writeinfojson': True,
-    'skip_download': True,
-    'writedescription': True,
-})
-
-
-TEST_ID = 'BaW_jenozKc'
-INFO_JSON_FILE = TEST_ID + '.info.json'
-DESCRIPTION_FILE = TEST_ID + '.mp4.description'
-EXPECTED_DESCRIPTION = '''test chars: "'/\ä↭𝕐
-test URL: https://github.com/rg3/youtube-dl/issues/1892
-
-This is a test video for youtube-dl.
-
-For more information, contact phihag@phihag.de .'''
-
-
-class TestInfoJSON(unittest.TestCase):
-    def setUp(self):
-        # Clear old files
-        self.tearDown()
-
-    def test_info_json(self):
-        ie = youtube_dl.extractor.YoutubeIE()
-        ydl = YoutubeDL(params)
-        ydl.add_info_extractor(ie)
-        ydl.download([TEST_ID])
-        self.assertTrue(os.path.exists(INFO_JSON_FILE))
-        with io.open(INFO_JSON_FILE, 'r', encoding='utf-8') as jsonf:
-            jd = json.load(jsonf)
-        self.assertEqual(jd['upload_date'], '20121002')
-        self.assertEqual(jd['description'], EXPECTED_DESCRIPTION)
-        self.assertEqual(jd['id'], TEST_ID)
-        self.assertEqual(jd['extractor'], 'youtube')
-        self.assertEqual(jd['title'], '''youtube-dl test video "'/\ä↭𝕐''')
-        self.assertEqual(jd['uploader'], 'Philipp Hagemeister')
-
-        self.assertTrue(os.path.exists(DESCRIPTION_FILE))
-        with io.open(DESCRIPTION_FILE, 'r', encoding='utf-8') as descf:
-            descr = descf.read()
-        self.assertEqual(descr, EXPECTED_DESCRIPTION)
-
-    def tearDown(self):
-        if os.path.exists(INFO_JSON_FILE):
-            os.remove(INFO_JSON_FILE)
-        if os.path.exists(DESCRIPTION_FILE):
-            os.remove(DESCRIPTION_FILE)
-
-if __name__ == '__main__':
-    unittest.main()
@@ -8,11 +8,11 @@ import sys
 import unittest
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 
 
 import io
 import re
 import string
 
+from test.helper import FakeYDL
 from youtube_dl.extractor import YoutubeIE
 from youtube_dl.compat import compat_str, compat_urlretrieve
-
 
@@ -64,6 +64,12 @@ _TESTS = [
         'js',
         '4646B5181C6C3020DF1D9C7FCFEA.AD80ABF70C39BD369CCCAE780AFBB98FA6B6CB42766249D9488C288',
         '82C8849D94266724DC6B6AF89BBFA087EACCD963.B93C07FBA084ACAEFCF7C9D1FD0203C6C1815B6B'
+    ),
+    (
+        'https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js',
+        'js',
+        '312AA52209E3623129A412D56A40F11CB0AF14AE.3EE09501CB14E3BCDC3B2AE808BF3F1D14E7FBF12',
+        '112AA5220913623229A412D56A40F11CB0AF14AE.3EE0950FCB14EEBCDC3B2AE808BF331D14E7FBF3',
     )
 ]
 
@@ -88,7 +94,8 @@ def make_tfunc(url, stype, sig_input, expected_sig):
         if not os.path.exists(fn):
             compat_urlretrieve(url, fn)
 
-        ie = YoutubeIE()
+        ydl = FakeYDL()
+        ie = YoutubeIE(ydl)
         if stype == 'js':
             with io.open(fn, encoding='utf-8') as testf:
                 jscode = testf.read()
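
The switch from YoutubeIE() to YoutubeIE(ydl) reflects that extractors expect a downloader to be attached before their helpers (page fetching, warning reporting) work. FakeYDL is the stub the test suite already provides; under that assumption, the pattern in isolation looks like:

from test.helper import FakeYDL
from youtube_dl.extractor import YoutubeIE

ydl = FakeYDL()
ie = YoutubeIE(ydl)  # same effect as YoutubeIE() followed by ie.set_downloader(ydl)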
52
test/testcert.pem
Normal file
@@ -0,0 +1,52 @@
+-----BEGIN PRIVATE KEY-----
+MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQDMF0bAzaHAdIyB
+HRmnIp4vv40lGqEePmWqicCl0QZ0wsb5dNysSxSa7330M2QeQopGfdaUYF1uTcNp
+Qx6ECgBSfg+RrOBI7r/u4F+sKX8MUXVaf/5QoBUrGNGSn/pp7HMGOuQqO6BVg4+h
+A1ySSwUG8mZItLRry1ISyErmW8b9xlqfd97uLME/5tX+sMelRFjUbAx8A4CK58Ev
+mMguHVTlXzx5RMdYcf1VScYcjlV/qA45uzP8zwI5aigfcmUD+tbGuQRhKxUhmw0J
+aobtOR6+JSOAULW5gYa/egE4dWLwbyM6b6eFbdnjlQzEA1EW7ChMPAW/Mo83KyiP
+tKMCSQulAgMBAAECggEALCfBDAexPjU5DNoh6bIorUXxIJzxTNzNHCdvgbCGiA54
+BBKPh8s6qwazpnjT6WQWDIg/O5zZufqjE4wM9x4+0Zoqfib742ucJO9wY4way6x4
+Clt0xzbLPabB+MoZ4H7ip+9n2+dImhe7pGdYyOHoNYeOL57BBi1YFW42Hj6u/8pd
+63YCXisto3Rz1YvRQVjwsrS+cRKZlzAFQRviL30jav7Wh1aWEfcXxjj4zhm8pJdk
+ITGtq6howz57M0NtX6hZnfe8ywzTnDFIGKIMA2cYHuYJcBh9bc4tCGubTvTKK9UE
+8fM+f6UbfGqfpKCq1mcgs0XMoFDSzKS9+mSJn0+5JQKBgQD+OCKaeH3Yzw5zGnlw
+XuQfMJGNcgNr+ImjmvzUAC2fAZUJLAcQueE5kzMv5Fmd+EFE2CEX1Vit3tg0SXvA
+G+bq609doILHMA03JHnV1npO/YNIhG3AAtJlKYGxQNfWH9mflYj9mEui8ZFxG52o
+zWhHYuifOjjZszUR+/eio6NPzwKBgQDNhUBTrT8LIX4SE/EFUiTlYmWIvOMgXYvN
+8Cm3IRNQ/yyphZaXEU0eJzfX5uCDfSVOgd6YM/2pRah+t+1Hvey4H8e0GVTu5wMP
+gkkqwKPGIR1YOmlw6ippqwvoJD7LuYrm6Q4D6e1PvkjwCq6lEndrOPmPrrXNd0JJ
+XO60y3U2SwKBgQDLkyZarryQXxcCI6Q10Tc6pskYDMIit095PUbTeiUOXNT9GE28
+Hi32ziLCakk9kCysNasii81MxtQ54tJ/f5iGbNMMddnkKl2a19Hc5LjjAm4cJzg/
+98KGEhvyVqvAo5bBDZ06/rcrD+lZOzUglQS5jcIcqCIYa0LHWQ/wJLxFzwKBgFcZ
+1SRhdSmDfUmuF+S4ZpistflYjC3IV5rk4NkS9HvMWaJS0nqdw4A3AMzItXgkjq4S
+DkOVLTkTI5Do5HAWRv/VwC5M2hkR4NMu1VGAKSisGiKtRsirBWSZMEenLNHshbjN
+Jrpz5rZ4H7NT46ZkCCZyFBpX4gb9NyOedjA7Via3AoGARF8RxbYjnEGGFuhnbrJB
+FTPR0vaL4faY3lOgRZ8jOG9V2c9Hzi/y8a8TU4C11jnJSDqYCXBTd5XN28npYxtD
+pjRsCwy6ze+yvYXPO7C978eMG3YRyj366NXUxnXN59ibwe/lxi2OD9z8J1LEdF6z
+VJua1Wn8HKxnXMI61DhTCSo=
+-----END PRIVATE KEY-----
+-----BEGIN CERTIFICATE-----
+MIIEEzCCAvugAwIBAgIJAK1haYi6gmSKMA0GCSqGSIb3DQEBCwUAMIGeMQswCQYD
+VQQGEwJERTEMMAoGA1UECAwDTlJXMRQwEgYDVQQHDAtEdWVzc2VsZG9yZjEbMBkG
+A1UECgwSeW91dHViZS1kbCBwcm9qZWN0MRkwFwYDVQQLDBB5b3V0dWJlLWRsIHRl
+c3RzMRIwEAYDVQQDDAlsb2NhbGhvc3QxHzAdBgkqhkiG9w0BCQEWEHBoaWhhZ0Bw
+aGloYWcuZGUwIBcNMTUwMTMwMDExNTA4WhgPMjExNTAxMDYwMTE1MDhaMIGeMQsw
+CQYDVQQGEwJERTEMMAoGA1UECAwDTlJXMRQwEgYDVQQHDAtEdWVzc2VsZG9yZjEb
+MBkGA1UECgwSeW91dHViZS1kbCBwcm9qZWN0MRkwFwYDVQQLDBB5b3V0dWJlLWRs
+IHRlc3RzMRIwEAYDVQQDDAlsb2NhbGhvc3QxHzAdBgkqhkiG9w0BCQEWEHBoaWhh
+Z0BwaGloYWcuZGUwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDMF0bA
+zaHAdIyBHRmnIp4vv40lGqEePmWqicCl0QZ0wsb5dNysSxSa7330M2QeQopGfdaU
+YF1uTcNpQx6ECgBSfg+RrOBI7r/u4F+sKX8MUXVaf/5QoBUrGNGSn/pp7HMGOuQq
+O6BVg4+hA1ySSwUG8mZItLRry1ISyErmW8b9xlqfd97uLME/5tX+sMelRFjUbAx8
+A4CK58EvmMguHVTlXzx5RMdYcf1VScYcjlV/qA45uzP8zwI5aigfcmUD+tbGuQRh
+KxUhmw0JaobtOR6+JSOAULW5gYa/egE4dWLwbyM6b6eFbdnjlQzEA1EW7ChMPAW/
+Mo83KyiPtKMCSQulAgMBAAGjUDBOMB0GA1UdDgQWBBTBUZoqhQkzHQ6xNgZfFxOd
+ZEVt8TAfBgNVHSMEGDAWgBTBUZoqhQkzHQ6xNgZfFxOdZEVt8TAMBgNVHRMEBTAD
+AQH/MA0GCSqGSIb3DQEBCwUAA4IBAQCUOCl3T/J9B08Z+ijfOJAtkbUaEHuVZb4x
+5EpZSy2ZbkLvtsftMFieHVNXn9dDswQc5qjYStCC4o60LKw4M6Y63FRsAZ/DNaqb
+PY3jyCyuugZ8/sNf50vHYkAcF7SQYqOQFQX4TQsNUk2xMJIt7H0ErQFmkf/u3dg6
+cy89zkT462IwxzSG7NNhIlRkL9o5qg+Y1mF9eZA1B0rcL6hO24PPTHOd90HDChBu
+SZ6XMi/LzYQSTf0Vg2R+uMIVlzSlkdcZ6sqVnnqeLL8dFyIa4e9sj/D4ZCYP8Mqe
+Z73H5/NNhmwCHRqVUTgm307xblQaWGhwAiDkaRvRW2aJQ0qGEdZK
+-----END CERTIFICATE-----
@@ -7,8 +7,10 @@ import collections
 import datetime
 import errno
 import io
+import itertools
 import json
 import locale
+import operator
 import os
 import platform
 import re
@@ -23,9 +25,11 @@ if os.name == 'nt':
     import ctypes
 
 from .compat import (
+    compat_basestring,
     compat_cookiejar,
     compat_expanduser,
     compat_http_client,
+    compat_kwargs,
     compat_str,
     compat_urllib_error,
     compat_urllib_request,
@@ -47,27 +51,38 @@ from .utils import (
     make_HTTPS_handler,
     MaxDownloadsReached,
     PagedList,
+    parse_filesize,
     PostProcessingError,
     platform_name,
     preferredencoding,
+    render_table,
     SameFileError,
     sanitize_filename,
+    std_headers,
     subtitles_filename,
     build_part_filename,
     takewhile_inclusive,
     UnavailableVideoError,
     url_basename,
+    version_tuple,
     write_json_file,
     write_string,
     YoutubeDLHandler,
     prepend_extension,
     args_to_str,
+    age_restricted,
 )
 from .cache import Cache
 from .extractor import get_info_extractor, gen_extractors
 from .downloader import get_suitable_downloader
 from .downloader.rtmp import rtmpdump_version
-from .postprocessor import FFmpegMergerPP, FFmpegPostProcessor
+from .postprocessor import (
+    FFmpegFixupM4aPP,
+    FFmpegFixupStretchedPP,
+    FFmpegMergerPP,
+    FFmpegPostProcessor,
+    get_postprocessor,
+)
 from .version import __version__
 
 
@@ -116,7 +131,7 @@ class YoutubeDL(object):
     dump_single_json:  Force printing the info_dict of the whole playlist
                        (or video) as a single JSON line.
     simulate:          Do not download the video files.
-    format:            Video format code.
+    format:            Video format code. See options.py for more information.
     format_limit:      Highest quality format to try.
     outtmpl:           Template for output names.
     restrictfilenames: Do not allow "&" and spaces in file names
@@ -124,6 +139,8 @@ class YoutubeDL(object):
     nooverwrites:      Prevent overwriting files.
     playliststart:     Playlist item to start at.
     playlistend:       Playlist item to end at.
+    playlist_items:    Specific indices of playlist to download.
+    playlistreverse:   Download playlist items in reverse order.
     matchtitle:        Download only matching titles.
     rejecttitle:       Reject downloads for matching titles.
     logger:            Log messages to a logging.Logger instance.
@@ -132,6 +149,7 @@ class YoutubeDL(object):
     writeinfojson:     Write the video description to a .info.json file
     writeannotations:  Write the video annotations to a .annotations.xml file
     writethumbnail:    Write the thumbnail image to a file
+    write_all_thumbnails:  Write all thumbnail formats to files
     writesubtitles:    Write the video subtitles to a file
     writeautomaticsub: Write the automatic subtitles to a file
     allsubtitles:      Downloads all the subtitles of the video
@@ -175,11 +193,65 @@ class YoutubeDL(object):
     extract_flat:      Do not resolve URLs, return the immediate result.
                        Pass in 'in_playlist' to only show this behavior for
                        playlist items.
+    postprocessors:    A list of dictionaries, each with an entry
+                       * key:  The name of the postprocessor. See
+                               youtube_dl/postprocessor/__init__.py for a list.
+                       as well as any further keyword arguments for the
+                       postprocessor.
+    progress_hooks:    A list of functions that get called on download
+                       progress, with a dictionary with the entries
+                       * status: One of "downloading", "error", or "finished".
+                                 Check this first and ignore unknown values.
+
+                       If status is one of "downloading", or "finished", the
+                       following properties may also be present:
+                       * filename: The final filename (always present)
+                       * tmpfilename: The filename we're currently writing to
+                       * downloaded_bytes: Bytes on disk
+                       * total_bytes: Size of the whole file, None if unknown
+                       * total_bytes_estimate: Guess of the eventual file size,
+                                               None if unavailable.
+                       * elapsed: The number of seconds since download started.
+                       * eta: The estimated time in seconds, None if unknown
+                       * speed: The download speed in bytes/second, None if
+                                unknown
+                       * fragment_index: The counter of the currently
+                                         downloaded video fragment.
+                       * fragment_count: The number of fragments (= individual
+                                         files that will be merged)
+
+                       Progress hooks are guaranteed to be called at least once
+                       (with status "finished") if the download is successful.
+    merge_output_format: Extension to use when merging formats.
+    fixup:             Automatically correct known faults of the file.
+                       One of:
+                       - "never": do nothing
+                       - "warn": only emit a warning
+                       - "detect_or_warn": check whether we can do anything
+                                           about it, warn otherwise (default)
+    source_address:    (Experimental) Client-side IP address to bind to.
+    call_home:         Boolean, true iff we are allowed to contact the
+                       youtube-dl servers for debugging.
+    sleep_interval:    Number of seconds to sleep before each download.
+    listformats:       Print an overview of available video formats and exit.
+    list_thumbnails:   Print a table of all thumbnails and exit.
+    match_filter:      A function that gets called with the info_dict of
+                       every video.
+                       If it returns a message, the video is ignored.
+                       If it returns None, the video is downloaded.
+                       match_filter_func in utils.py is one example for this.
+    no_color:          Do not emit color codes in output.
+
+    The following options determine which downloader is picked:
+    external_downloader: Executable of the external downloader to call.
+                       None or unset for standard (built-in) downloader.
+    hls_prefer_native: Use the native HLS downloader instead of ffmpeg/avconv.
+
     The following parameters are not used by YoutubeDL itself, they are used by
     the FileDownloader:
     nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test,
-    noresizebuffer, retries, continuedl, noprogress, consoletitle
+    noresizebuffer, retries, continuedl, noprogress, consoletitle,
+    xattr_set_filesize.
 
     The following options are used by the post processors:
     prefer_ffmpeg:     If True, use ffmpeg instead of avconv if both are available,
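
The new postprocessors, progress_hooks and match_filter parameters make the embedding API considerably more scriptable. A rough usage sketch (option values are illustrative; 'FFmpegExtractAudio' is one of the keys registered in youtube_dl/postprocessor/__init__.py):

import youtube_dl


def hook(d):
    # Always check 'status' first and ignore unknown values
    if d['status'] == 'finished':
        print('Done downloading %s' % d['filename'])


def only_short_videos(info_dict):
    # Returning a message skips the video; returning None downloads it
    if info_dict.get('duration', 0) > 600:
        return 'Skipping %s: longer than 10 minutes' % info_dict.get('id')
    return None


ydl_opts = {
    'postprocessors': [{'key': 'FFmpegExtractAudio', 'preferredcodec': 'mp3'}],
    'progress_hooks': [hook],
    'match_filter': only_short_videos,
}
ydl = youtube_dl.YoutubeDL(ydl_opts)
ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])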
@@ -255,6 +327,16 @@ class YoutubeDL(object):
             self.print_debug_header()
             self.add_default_info_extractors()
 
+        for pp_def_raw in self.params.get('postprocessors', []):
+            pp_class = get_postprocessor(pp_def_raw['key'])
+            pp_def = dict(pp_def_raw)
+            del pp_def['key']
+            pp = pp_class(self, **compat_kwargs(pp_def))
+            self.add_post_processor(pp)
+
+        for ph in self.params.get('progress_hooks', []):
+            self.add_progress_hook(ph)
+
     def warn_if_short_id(self, argv):
         # short YouTube ID starting with dash?
         idxs = [
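
The loop above is a small registry pattern: each dict's 'key' selects a postprocessor class by name, and the remaining entries become constructor keyword arguments (compat_kwargs only normalizes them for Python 2). Stand-alone, with a hypothetical registry, the same idea reads:

# Hypothetical stand-alone version of the 'key'-based instantiation above
REGISTRY = {'Echo': lambda downloader, suffix='': ('echo-pp', suffix)}


def build_pp(pp_def_raw, downloader):
    pp_def = dict(pp_def_raw)        # copy, so the caller's dict is untouched
    pp_class = REGISTRY[pp_def.pop('key')]
    return pp_class(downloader, **pp_def)


pp = build_pp({'key': 'Echo', 'suffix': '.done'}, downloader=None)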
@@ -420,7 +502,7 @@ class YoutubeDL(object):
         else:
             if self.params.get('no_warnings'):
                 return
-            if self._err_file.isatty() and os.name != 'nt':
+            if not self.params.get('no_color') and self._err_file.isatty() and os.name != 'nt':
                 _msg_header = '\033[0;33mWARNING:\033[0m'
             else:
                 _msg_header = 'WARNING:'
@@ -432,7 +514,7 @@ class YoutubeDL(object):
         Do the same as trouble, but prefixes the message with 'ERROR:', colored
         in red if stderr is a tty file.
         '''
-        if self._err_file.isatty() and os.name != 'nt':
+        if not self.params.get('no_color') and self._err_file.isatty() and os.name != 'nt':
             _msg_header = '\033[0;31mERROR:\033[0m'
         else:
             _msg_header = 'ERROR:'
@@ -479,12 +561,17 @@ class YoutubeDL(object):
             outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
             tmpl = compat_expanduser(outtmpl)
             filename = tmpl % template_dict
+            # Temporary fix for #4787
+            # 'Treat' all problem characters by passing filename through preferredencoding
+            # to workaround encoding issues with subprocess on python2 @ Windows
+            if sys.version_info < (3, 0) and sys.platform == 'win32':
+                filename = encodeFilename(filename, True).decode(preferredencoding())
             return filename
         except ValueError as err:
             self.report_error('Error in output template: ' + str(err) + ' (encoding: ' + repr(preferredencoding()) + ')')
             return None
 
-    def _match_entry(self, info_dict):
+    def _match_entry(self, info_dict, incomplete):
         """ Returns None iff the file should be downloaded """
 
         video_title = info_dict.get('title', info_dict.get('id', 'video'))
@@ -512,15 +599,18 @@ class YoutubeDL(object):
         max_views = self.params.get('max_views')
         if max_views is not None and view_count > max_views:
             return 'Skipping %s, because it has exceeded the maximum view count (%d/%d)' % (video_title, view_count, max_views)
-        age_limit = self.params.get('age_limit')
-        if age_limit is not None:
-            actual_age_limit = info_dict.get('age_limit')
-            if actual_age_limit is None:
-                actual_age_limit = 0
-            if age_limit < actual_age_limit:
-                return 'Skipping "' + title + '" because it is age restricted'
+        if age_restricted(info_dict.get('age_limit'), self.params.get('age_limit')):
+            return 'Skipping "%s" because it is age restricted' % video_title
         if self.in_download_archive(info_dict):
             return '%s has already been recorded in archive' % video_title
+
+        if not incomplete:
+            match_filter = self.params.get('match_filter')
+            if match_filter is not None:
+                ret = match_filter(info_dict)
+                if ret is not None:
+                    return ret
+
         return None
 
     @staticmethod
@@ -622,23 +712,15 @@ class YoutubeDL(object):
                 ie_result['url'], ie_key=ie_result.get('ie_key'),
                 extra_info=extra_info, download=False, process=False)
 
-            def make_result(embedded_info):
-                new_result = ie_result.copy()
-                for f in ('_type', 'url', 'ext', 'player_url', 'formats',
-                          'entries', 'ie_key', 'duration',
-                          'subtitles', 'annotations', 'format',
-                          'thumbnail', 'thumbnails'):
-                    if f in new_result:
-                        del new_result[f]
-                    if f in embedded_info:
-                        new_result[f] = embedded_info[f]
-                return new_result
-            new_result = make_result(info)
+            force_properties = dict(
+                (k, v) for k, v in ie_result.items() if v is not None)
+            for f in ('_type', 'url'):
+                if f in force_properties:
+                    del force_properties[f]
+            new_result = info.copy()
+            new_result.update(force_properties)
 
             assert new_result.get('_type') != 'url_transparent'
-            if new_result.get('_type') == 'compat_list':
-                new_result['entries'] = [
-                    make_result(e) for e in new_result['entries']]
 
             return self.process_ie_result(
                 new_result, download=download, extra_info=extra_info)
@@ -655,24 +737,61 @@ class YoutubeDL(object):
         if playlistend == -1:
             playlistend = None
 
-        if isinstance(ie_result['entries'], list):
-            n_all_entries = len(ie_result['entries'])
-            entries = ie_result['entries'][playliststart:playlistend]
+        playlistitems_str = self.params.get('playlist_items', None)
+        playlistitems = None
+        if playlistitems_str is not None:
+            def iter_playlistitems(format):
+                for string_segment in format.split(','):
+                    if '-' in string_segment:
+                        start, end = string_segment.split('-')
+                        for item in range(int(start), int(end) + 1):
+                            yield int(item)
+                    else:
+                        yield int(string_segment)
+            playlistitems = iter_playlistitems(playlistitems_str)
+
+        ie_entries = ie_result['entries']
+        if isinstance(ie_entries, list):
+            n_all_entries = len(ie_entries)
+            if playlistitems:
+                entries = [ie_entries[i - 1] for i in playlistitems]
+            else:
+                entries = ie_entries[playliststart:playlistend]
             n_entries = len(entries)
             self.to_screen(
                 "[%s] playlist %s: Collected %d video ids (downloading %d of them)" %
                 (ie_result['extractor'], playlist, n_all_entries, n_entries))
-        else:
-            assert isinstance(ie_result['entries'], PagedList)
-            entries = ie_result['entries'].getslice(
-                playliststart, playlistend)
+        elif isinstance(ie_entries, PagedList):
+            if playlistitems:
+                entries = []
+                for item in playlistitems:
+                    entries.extend(ie_entries.getslice(
+                        item - 1, item
+                    ))
+            else:
+                entries = ie_entries.getslice(
+                    playliststart, playlistend)
+            n_entries = len(entries)
+            self.to_screen(
+                "[%s] playlist %s: Downloading %d videos" %
+                (ie_result['extractor'], playlist, n_entries))
+        else:  # iterable
+            if playlistitems:
+                entry_list = list(ie_entries)
+                entries = [entry_list[i - 1] for i in playlistitems]
+            else:
+                entries = list(itertools.islice(
+                    ie_entries, playliststart, playlistend))
             n_entries = len(entries)
             self.to_screen(
                 "[%s] playlist %s: Downloading %d videos" %
                 (ie_result['extractor'], playlist, n_entries))
+
+        if self.params.get('playlistreverse', False):
+            entries = entries[::-1]
+
         for i, entry in enumerate(entries, 1):
-            self.to_screen('[download] Downloading video #%s of %s' % (i, n_entries))
+            self.to_screen('[download] Downloading video %s of %s' % (i, n_entries))
             extra = {
                 'n_entries': n_entries,
                 'playlist': playlist,
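
iter_playlistitems implements the --playlist-items grammar: comma-separated segments, each either a single 1-based index or an inclusive 'start-end' span. A stand-alone sketch of the same parsing, for experimentation:

def parse_playlist_items(spec):
    """'1-3,7,10-12' -> [1, 2, 3, 7, 10, 11, 12] (1-based, inclusive)."""
    items = []
    for segment in spec.split(','):
        if '-' in segment:
            start, end = segment.split('-')
            items.extend(range(int(start), int(end) + 1))
        else:
            items.append(int(segment))
    return items


assert parse_playlist_items('1-3,7,10-12') == [1, 2, 3, 7, 10, 11, 12]

Note that the real code wraps this in a generator, which is consumed exactly once per playlist in the branches above.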
@@ -685,7 +804,7 @@ class YoutubeDL(object):
                 'extractor_key': ie_result['extractor_key'],
             }
 
-            reason = self._match_entry(entry)
+            reason = self._match_entry(entry, incomplete=True)
             if reason is not None:
                 self.to_screen('[download] ' + reason)
                 continue
@@ -720,7 +839,76 @@ class YoutubeDL(object):
         else:
             raise Exception('Invalid result type: %s' % result_type)
 
+    def _apply_format_filter(self, format_spec, available_formats):
+        " Returns a tuple of the remaining format_spec and filtered formats "
+
+        OPERATORS = {
+            '<': operator.lt,
+            '<=': operator.le,
+            '>': operator.gt,
+            '>=': operator.ge,
+            '=': operator.eq,
+            '!=': operator.ne,
+        }
+        operator_rex = re.compile(r'''(?x)\s*\[
+            (?P<key>width|height|tbr|abr|vbr|asr|filesize|fps)
+            \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?\s*
+            (?P<value>[0-9.]+(?:[kKmMgGtTpPeEzZyY]i?[Bb]?)?)
+            \]$
+            ''' % '|'.join(map(re.escape, OPERATORS.keys())))
+        m = operator_rex.search(format_spec)
+        if m:
+            try:
+                comparison_value = int(m.group('value'))
+            except ValueError:
+                comparison_value = parse_filesize(m.group('value'))
+                if comparison_value is None:
+                    comparison_value = parse_filesize(m.group('value') + 'B')
+                if comparison_value is None:
+                    raise ValueError(
+                        'Invalid value %r in format specification %r' % (
+                            m.group('value'), format_spec))
+            op = OPERATORS[m.group('op')]
+
+        if not m:
+            STR_OPERATORS = {
+                '=': operator.eq,
+                '!=': operator.ne,
+            }
+            str_operator_rex = re.compile(r'''(?x)\s*\[
+                \s*(?P<key>ext|acodec|vcodec|container|protocol)
+                \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
+                \s*(?P<value>[a-zA-Z0-9_-]+)
+                \s*\]$
+                ''' % '|'.join(map(re.escape, STR_OPERATORS.keys())))
+            m = str_operator_rex.search(format_spec)
+            if m:
+                comparison_value = m.group('value')
+                op = STR_OPERATORS[m.group('op')]
+
+        if not m:
+            raise ValueError('Invalid format specification %r' % format_spec)
+
+        def _filter(f):
+            actual_value = f.get(m.group('key'))
+            if actual_value is None:
+                return m.group('none_inclusive')
+            return op(actual_value, comparison_value)
+        new_formats = [f for f in available_formats if _filter(f)]
+
+        new_format_spec = format_spec[:-len(m.group(0))]
+        if not new_format_spec:
+            new_format_spec = 'best'
+
+        return (new_format_spec, new_formats)
+
     def select_format(self, format_spec, available_formats):
+        while format_spec.endswith(']'):
+            format_spec, available_formats = self._apply_format_filter(
+                format_spec, available_formats)
+        if not available_formats:
+            return None
+
         if format_spec == 'best' or format_spec is None:
             return available_formats[-1]
         elif format_spec == 'worst':
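
_apply_format_filter peels bracketed conditions off the end of a format spec one at a time, narrowing the candidate list before the base selector ('best', 'worst', an extension, ...) runs. A few specs this grammar accepts, per the two regexes above:

# Numeric keys compare with <, <=, >, >=, =, !=; values may carry SI suffixes
# that parse_filesize understands. String keys (ext, acodec, vcodec,
# container, protocol) only support = and !=. A trailing '?' after the
# operator makes formats that lack the key pass instead of fail.
specs = [
    'best[height<=480]',            # tallest format no taller than 480p
    'best[filesize<50M]',           # parse_filesize turns '50M' into bytes
    'worst[ext=mp4]',               # string equality on the extension
    'best[height<=?720][tbr>500]',  # '?' keeps formats with unknown height
]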
@@ -750,7 +938,7 @@ class YoutubeDL(object):
             if video_formats:
                 return video_formats[0]
         else:
-            extensions = ['mp4', 'flv', 'webm', '3gp', 'm4a']
+            extensions = ['mp4', 'flv', 'webm', '3gp', 'm4a', 'mp3', 'ogg', 'aac', 'wav']
             if format_spec in extensions:
                 filter_f = lambda f: f['ext'] == format_spec
             else:
@@ -760,6 +948,24 @@ class YoutubeDL(object):
                 return matches[-1]
         return None
 
+    def _calc_headers(self, info_dict):
+        res = std_headers.copy()
+
+        add_headers = info_dict.get('http_headers')
+        if add_headers:
+            res.update(add_headers)
+
+        cookies = self._calc_cookies(info_dict)
+        if cookies:
+            res['Cookie'] = cookies
+
+        return res
+
+    def _calc_cookies(self, info_dict):
+        pr = compat_urllib_request.Request(info_dict['url'])
+        self.cookiejar.add_cookie_header(pr)
+        return pr.get_header('Cookie')
+
     def process_video_result(self, info_dict, download=True):
         assert info_dict.get('_type', 'video') == 'video'
 
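
_calc_cookies uses a neat trick: rather than reimplementing cookie matching (domain, path, secure flags), it builds a throwaway Request and lets the cookiejar populate its Cookie header. The same pattern in plain Python 3 (module names differ from the compat_* aliases used above):

import http.cookiejar
import urllib.request

jar = http.cookiejar.CookieJar()
# ... jar gets filled by earlier responses ...
req = urllib.request.Request('https://example.com/video')
jar.add_cookie_header(req)               # applies domain/path/secure matching
cookie_value = req.get_header('Cookie')  # None if no cookie applies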
@@ -774,12 +980,19 @@ class YoutubeDL(object):
             info_dict['playlist_index'] = None
 
         thumbnails = info_dict.get('thumbnails')
+        if thumbnails is None:
+            thumbnail = info_dict.get('thumbnail')
+            if thumbnail:
+                info_dict['thumbnails'] = thumbnails = [{'url': thumbnail}]
         if thumbnails:
             thumbnails.sort(key=lambda t: (
-                t.get('width'), t.get('height'), t.get('url')))
-            for t in thumbnails:
+                t.get('preference'), t.get('width'), t.get('height'),
+                t.get('id'), t.get('url')))
+            for i, t in enumerate(thumbnails):
                 if 'width' in t and 'height' in t:
                     t['resolution'] = '%dx%d' % (t['width'], t['height'])
+                if t.get('id') is None:
+                    t['id'] = '%d' % i
 
         if thumbnails and 'thumbnail' not in info_dict:
             info_dict['thumbnail'] = thumbnails[-1]['url']
@@ -788,6 +1001,10 @@ class YoutubeDL(object):
             info_dict['display_id'] = info_dict['id']
 
         if info_dict.get('upload_date') is None and info_dict.get('timestamp') is not None:
+            # Working around negative timestamps in Windows
+            # (see http://bugs.python.org/issue1646728)
+            if info_dict['timestamp'] < 0 and os.name == 'nt':
+                info_dict['timestamp'] = 0
             upload_date = datetime.datetime.utcfromtimestamp(
                 info_dict['timestamp'])
             info_dict['upload_date'] = upload_date.strftime('%Y%m%d')
@@ -829,6 +1046,11 @@ class YoutubeDL(object):
             # Automatically determine file extension if missing
             if 'ext' not in format:
                 format['ext'] = determine_ext(format['url']).lower()
+            # Add HTTP headers, so that external programs can use them from the
+            # json output
+            full_format_info = info_dict.copy()
+            full_format_info.update(format)
+            format['http_headers'] = self._calc_headers(full_format_info)
 
         format_limit = self.params.get('format_limit', None)
         if format_limit:
@@ -844,9 +1066,12 @@ class YoutubeDL(object):
         # element in the 'formats' field in info_dict is info_dict itself,
         # wich can't be exported to json
         info_dict['formats'] = formats
-        if self.params.get('listformats', None):
+        if self.params.get('listformats'):
             self.list_formats(info_dict)
             return
+        if self.params.get('list_thumbnails'):
+            self.list_thumbnails(info_dict)
+            return
 
         req_format = self.params.get('format')
         if req_format is None:
@@ -874,10 +1099,26 @@ class YoutubeDL(object):
                                 'contain the video, try using '
                                 '"-f %s+%s"' % (format_2, format_1))
                             return
+                        output_ext = (
+                            formats_info[0]['ext']
+                            if self.params.get('merge_output_format') is None
+                            else self.params['merge_output_format'])
                         selected_format = {
                             'requested_formats': formats_info,
-                            'format': rf,
-                            'ext': formats_info[0]['ext'],
+                            'format': '%s+%s' % (formats_info[0].get('format'),
+                                                 formats_info[1].get('format')),
+                            'format_id': '%s+%s' % (formats_info[0].get('format_id'),
+                                                    formats_info[1].get('format_id')),
+                            'width': formats_info[0].get('width'),
+                            'height': formats_info[0].get('height'),
+                            'resolution': formats_info[0].get('resolution'),
+                            'fps': formats_info[0].get('fps'),
+                            'vcodec': formats_info[0].get('vcodec'),
+                            'vbr': formats_info[0].get('vbr'),
+                            'stretched_ratio': formats_info[0].get('stretched_ratio'),
+                            'acodec': formats_info[1].get('acodec'),
+                            'abr': formats_info[1].get('abr'),
+                            'ext': output_ext,
                         }
                     else:
                         selected_format = None
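
With merge_output_format, a 'video+audio' request can now land in a container of the caller's choosing instead of defaulting to the video track's extension. Assuming the standard embedding API, usage looks roughly like:

import youtube_dl

ydl_opts = {
    'format': 'bestvideo+bestaudio',  # two formats, merged after download
    'merge_output_format': 'mkv',     # container for the merged file
}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])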
@ -921,14 +1162,14 @@ class YoutubeDL(object):
|
|||||||
if 'format' not in info_dict:
|
if 'format' not in info_dict:
|
||||||
info_dict['format'] = info_dict['ext']
|
info_dict['format'] = info_dict['ext']
|
||||||
|
|
||||||
reason = self._match_entry(info_dict)
|
reason = self._match_entry(info_dict, incomplete=False)
|
||||||
if reason is not None:
|
if reason is not None:
|
||||||
self.to_screen('[download] ' + reason)
|
self.to_screen('[download] ' + reason)
|
||||||
return
|
return
|
||||||
|
|
||||||
self._num_downloads += 1
|
self._num_downloads += 1
|
||||||
|
|
||||||
filename = self.prepare_filename(info_dict)
|
info_dict['_filename'] = filename = self.prepare_filename(info_dict)
|
||||||
|
|
||||||
# Forced printings
|
# Forced printings
|
||||||
if self.params.get('forcetitle', False):
|
if self.params.get('forcetitle', False):
|
||||||
@ -936,8 +1177,12 @@ class YoutubeDL(object):
|
|||||||
if self.params.get('forceid', False):
|
if self.params.get('forceid', False):
|
||||||
self.to_stdout(info_dict['id'])
|
self.to_stdout(info_dict['id'])
|
||||||
if self.params.get('forceurl', False):
|
if self.params.get('forceurl', False):
|
||||||
# For RTMP URLs, also include the playpath
|
if info_dict.get('requested_formats') is not None:
|
||||||
self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
|
for f in info_dict['requested_formats']:
|
||||||
|
self.to_stdout(f['url'] + f.get('play_path', ''))
|
||||||
|
else:
|
||||||
|
# For RTMP URLs, also include the playpath
|
||||||
|
self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
|
||||||
if self.params.get('forcethumbnail', False) and info_dict.get('thumbnail') is not None:
|
if self.params.get('forcethumbnail', False) and info_dict.get('thumbnail') is not None:
|
||||||
self.to_stdout(info_dict['thumbnail'])
|
self.to_stdout(info_dict['thumbnail'])
|
||||||
if self.params.get('forcedescription', False) and info_dict.get('description') is not None:
|
if self.params.get('forcedescription', False) and info_dict.get('description') is not None:
|
||||||
@ -949,10 +1194,7 @@ class YoutubeDL(object):
|
|||||||
if self.params.get('forceformat', False):
|
if self.params.get('forceformat', False):
|
||||||
self.to_stdout(info_dict['format'])
|
self.to_stdout(info_dict['format'])
|
||||||
if self.params.get('forcejson', False):
|
if self.params.get('forcejson', False):
|
||||||
info_dict['_filename'] = filename
|
|
||||||
self.to_stdout(json.dumps(info_dict))
|
self.to_stdout(json.dumps(info_dict))
|
||||||
if self.params.get('dump_single_json', False):
|
|
||||||
info_dict['_filename'] = filename
|
|
||||||
|
|
||||||
# Do nothing else if in simulate mode
|
# Do nothing else if in simulate mode
|
||||||
if self.params.get('simulate', False):
|
if self.params.get('simulate', False):
|
||||||
@ -973,13 +1215,13 @@ class YoutubeDL(object):
|
|||||||
descfn = filename + '.description'
|
descfn = filename + '.description'
|
||||||
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(descfn)):
|
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(descfn)):
|
||||||
self.to_screen('[info] Video description is already present')
|
self.to_screen('[info] Video description is already present')
|
||||||
|
elif info_dict.get('description') is None:
|
||||||
|
self.report_warning('There\'s no description to write.')
|
||||||
else:
|
else:
|
||||||
try:
|
try:
|
||||||
self.to_screen('[info] Writing video description to: ' + descfn)
|
self.to_screen('[info] Writing video description to: ' + descfn)
|
||||||
with io.open(encodeFilename(descfn), 'w', encoding='utf-8') as descfile:
|
with io.open(encodeFilename(descfn), 'w', encoding='utf-8') as descfile:
|
||||||
descfile.write(info_dict['description'])
|
descfile.write(info_dict['description'])
|
||||||
except (KeyError, TypeError):
|
|
||||||
self.report_warning('There\'s no description to write.')
|
|
||||||
except (OSError, IOError):
|
except (OSError, IOError):
|
||||||
self.report_error('Cannot write description file ' + descfn)
|
self.report_error('Cannot write description file ' + descfn)
|
||||||
return
|
return
|
||||||
@ -1035,97 +1277,116 @@ class YoutubeDL(object):
|
|||||||
self.report_error('Cannot write metadata to JSON file ' + infofn)
|
self.report_error('Cannot write metadata to JSON file ' + infofn)
|
||||||
return
|
return
|
||||||
|
|
||||||
if self.params.get('writethumbnail', False):
|
self._write_thumbnails(info_dict, filename)
|
||||||
if info_dict.get('thumbnail') is not None:
|
|
||||||
thumb_format = determine_ext(info_dict['thumbnail'], 'jpg')
|
|
||||||
thumb_filename = os.path.splitext(filename)[0] + '.' + thumb_format
|
|
||||||
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(thumb_filename)):
|
|
||||||
self.to_screen('[%s] %s: Thumbnail is already present' %
|
|
||||||
(info_dict['extractor'], info_dict['id']))
|
|
||||||
else:
|
|
||||||
self.to_screen('[%s] %s: Downloading thumbnail ...' %
|
|
||||||
(info_dict['extractor'], info_dict['id']))
|
|
||||||
try:
|
|
||||||
uf = self.urlopen(info_dict['thumbnail'])
|
|
||||||
with open(thumb_filename, 'wb') as thumbf:
|
|
||||||
shutil.copyfileobj(uf, thumbf)
|
|
||||||
self.to_screen('[%s] %s: Writing thumbnail to: %s' %
|
|
||||||
(info_dict['extractor'], info_dict['id'], thumb_filename))
|
|
||||||
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
|
|
||||||
self.report_warning('Unable to download thumbnail "%s": %s' %
|
|
||||||
(info_dict['thumbnail'], compat_str(err)))
|
|
||||||
|
|
||||||
if not self.params.get('skip_download', False):
|
if not self.params.get('skip_download', False):
|
||||||
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(filename)):
|
try:
|
||||||
success = True
|
def dl(name, info):
|
||||||
else:
|
fd = get_suitable_downloader(info, self.params)(self, self.params)
|
||||||
try:
|
for ph in self._progress_hooks:
|
||||||
def dl(name, info):
|
fd.add_progress_hook(ph)
|
||||||
fd = get_suitable_downloader(info)(self, self.params)
|
if self.params.get('verbose'):
|
||||||
for ph in self._progress_hooks:
|
self.to_stdout('[debug] Invoking downloader on %r' % info.get('url'))
|
||||||
fd.add_progress_hook(ph)
|
return fd.download(name, info)
|
||||||
if self.params.get('verbose'):
|
|
||||||
self.to_stdout('[debug] Invoking downloader on %r' % info.get('url'))
|
if info_dict.get('requested_formats') is not None:
|
||||||
return fd.download(name, info)
|
downloaded = []
|
||||||
if info_dict.get('requested_formats') is not None:
|
success = True
|
||||||
downloaded = []
|
merger = FFmpegMergerPP(self, not self.params.get('keepvideo'))
|
||||||
success = True
|
if not merger.available:
|
||||||
merger = FFmpegMergerPP(self, not self.params.get('keepvideo'))
|
postprocessors = []
|
||||||
if not merger._executable:
|
self.report_warning('You have requested multiple '
|
||||||
postprocessors = []
|
'formats but ffmpeg or avconv are not installed.'
|
||||||
self.report_warning('You have requested multiple '
|
' The formats won\'t be merged')
|
||||||
'formats but ffmpeg or avconv are not installed.'
|
|
||||||
' The formats won\'t be merged')
|
|
||||||
else:
|
|
||||||
postprocessors = [merger]
|
|
||||||
for f in info_dict['requested_formats']:
|
|
||||||
new_info = dict(info_dict)
|
|
||||||
new_info.update(f)
|
|
||||||
fname = self.prepare_filename(new_info)
|
|
||||||
fname = prepend_extension(fname, 'f%s' % f['format_id'])
|
|
||||||
downloaded.append(fname)
|
|
||||||
partial_success = dl(fname, new_info)
|
|
||||||
success = success and partial_success
|
|
||||||
-                    info_dict['__postprocessors'] = postprocessors
-                    info_dict['__files_to_merge'] = downloaded
                 else:
-                    parts = info_dict.get('parts', [])
-                    if not parts:
-                        success = dl(filename, info_dict)
-                    elif len(parts) == 1:
-                        info_dict.update(parts[0])
-                        success = dl(filename, info_dict)
-                    else:
-                        # We check if the final video has already been downloaded
-                        if self.params.get('continuedl', False) and os.path.isfile(encodeFilename(filename)):
-                            self.fd.report_file_already_downloaded(filename)
-                            success = True
-                        else:
-                            parts_success = []
-                            self.to_screen(u'[info] Downloading %s parts' % len(parts))
-                            for (i, part) in enumerate(parts):
-                                part_info = dict(info_dict)
-                                part_info.update(part)
-                                part_filename = build_part_filename(filename, i)
-                                parts_success.append(dl(part_filename, part_info))
-                            success = all(parts_success)
-            except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
-                self.report_error('unable to download video data: %s' % str(err))
-                return
-            except (OSError, IOError) as err:
-                raise UnavailableVideoError(err)
-            except (ContentTooShortError, ) as err:
-                self.report_error('content too short (expected %s bytes and served %s)' % (err.expected, err.downloaded))
-                return
+                        postprocessors = [merger]
+                    parts = info_dict.get('parts', [])
+                    if not parts:
+                        success = dl(filename, info_dict)
+                    elif len(parts) == 1:
+                        info_dict.update(parts[0])
+                        success = dl(filename, info_dict)
+                    else:
+                        # We check if the final video has already been downloaded
+                        if self.params.get('continuedl', False) and os.path.isfile(encodeFilename(filename)):
+                            self.fd.report_file_already_downloaded(filename)
+                            success = True
+                        else:
+                            parts_success = []
+                            self.to_screen(u'[info] Downloading %s parts' % len(parts))
+                            for (i, part) in enumerate(parts):
+                                part_info = dict(info_dict)
+                                part_info.update(part)
+                                part_filename = build_part_filename(filename, i)
+                                parts_success.append(dl(part_filename, part_info))
+                            success = all(parts_success)
+                    for f in info_dict['requested_formats']:
+                        new_info = dict(info_dict)
+                        new_info.update(f)
+                        fname = self.prepare_filename(new_info)
+                        fname = prepend_extension(fname, 'f%s' % f['format_id'])
+                        downloaded.append(fname)
+                        partial_success = dl(fname, new_info)
+                        success = success and partial_success
+                    info_dict['__postprocessors'] = postprocessors
+                    info_dict['__files_to_merge'] = downloaded
+                else:
+                    # Just a single file
+                    success = dl(filename, info_dict)
+            except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
+                self.report_error('unable to download video data: %s' % str(err))
+                return
+            except (OSError, IOError) as err:
+                raise UnavailableVideoError(err)
+            except (ContentTooShortError, ) as err:
+                self.report_error('content too short (expected %s bytes and served %s)' % (err.expected, err.downloaded))
+                return

         if success:
+            # Fixup content
+            fixup_policy = self.params.get('fixup')
+            if fixup_policy is None:
+                fixup_policy = 'detect_or_warn'
+
+            stretched_ratio = info_dict.get('stretched_ratio')
+            if stretched_ratio is not None and stretched_ratio != 1:
+                if fixup_policy == 'warn':
+                    self.report_warning('%s: Non-uniform pixel ratio (%s)' % (
+                        info_dict['id'], stretched_ratio))
+                elif fixup_policy == 'detect_or_warn':
+                    stretched_pp = FFmpegFixupStretchedPP(self)
+                    if stretched_pp.available:
+                        info_dict.setdefault('__postprocessors', [])
+                        info_dict['__postprocessors'].append(stretched_pp)
+                    else:
+                        self.report_warning(
+                            '%s: Non-uniform pixel ratio (%s). Install ffmpeg or avconv to fix this automatically.' % (
+                                info_dict['id'], stretched_ratio))
+                else:
+                    assert fixup_policy in ('ignore', 'never')
+
+            if info_dict.get('requested_formats') is None and info_dict.get('container') == 'm4a_dash':
+                if fixup_policy == 'warn':
+                    self.report_warning('%s: writing DASH m4a. Only some players support this container.' % (
+                        info_dict['id']))
+                elif fixup_policy == 'detect_or_warn':
+                    fixup_pp = FFmpegFixupM4aPP(self)
+                    if fixup_pp.available:
+                        info_dict.setdefault('__postprocessors', [])
+                        info_dict['__postprocessors'].append(fixup_pp)
+                    else:
+                        self.report_warning(
+                            '%s: writing DASH m4a. Only some players support this container. Install ffmpeg or avconv to fix this automatically.' % (
+                                info_dict['id']))
+                else:
+                    assert fixup_policy in ('ignore', 'never')
+
             try:
                 self.post_process(filename, info_dict)
             except (PostProcessingError) as err:
                 self.report_error('postprocessing: %s' % str(err))
                 return
-        self.record_download_archive(info_dict)
+            self.record_download_archive(info_dict)

     def download(self, url_list):
         """Download a given list of URLs."""
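The parts loop above needs a per-part output name derived from the final filename. A minimal sketch of what such a helper could look like, assuming it simply tags the part index onto the name (the fork's real build_part_filename is defined elsewhere and may differ):

    import os

    def build_part_filename(filename, i):
        # Hypothetical: 'video.mp4' -> 'video.part0.mp4' for part index 0.
        name, ext = os.path.splitext(filename)
        return '%s.part%d%s' % (name, i, ext)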
@@ -1168,14 +1429,15 @@ class YoutubeDL(object):
         """Run all the postprocessors on the given file."""
         info = dict(ie_info)
         info['filepath'] = filename
-        keep_video = None
         pps_chain = []
         if ie_info.get('__postprocessors') is not None:
             pps_chain.extend(ie_info['__postprocessors'])
         pps_chain.extend(self._pps)
         for pp in pps_chain:
+            keep_video = None
+            old_filename = info['filepath']
             try:
-                keep_video_wish, new_info = pp.run(info)
+                keep_video_wish, info = pp.run(info)
                 if keep_video_wish is not None:
                     if keep_video_wish:
                         keep_video = keep_video_wish
@@ -1184,12 +1446,12 @@ class YoutubeDL(object):
                         keep_video = keep_video_wish
             except PostProcessingError as e:
                 self.report_error(e.msg)
             if keep_video is False and not self.params.get('keepvideo', False):
                 try:
-                    self.to_screen('Deleting original file %s (pass -k to keep)' % filename)
-                    os.remove(encodeFilename(filename))
+                    self.to_screen('Deleting original file %s (pass -k to keep)' % old_filename)
+                    os.remove(encodeFilename(old_filename))
                 except (IOError, OSError):
                     self.report_warning('Unable to remove downloaded video file')

     def _make_archive_id(self, info_dict):
         # Future-proof against any change in case
@@ -1298,27 +1560,35 @@ class YoutubeDL(object):
         return res

     def list_formats(self, info_dict):
-        def line(format, idlen=20):
-            return (('%-' + compat_str(idlen + 1) + 's%-10s%-12s%s') % (
-                format['format_id'],
-                format['ext'],
-                self.format_resolution(format),
-                self._format_note(format),
-            ))
-
         formats = info_dict.get('formats', [info_dict])
-        idlen = max(len('format code'),
-                    max(len(f['format_id']) for f in formats))
-        formats_s = [line(f, idlen) for f in formats]
+        table = [
+            [f['format_id'], f['ext'], self.format_resolution(f), self._format_note(f)]
+            for f in formats
+            if f.get('preference') is None or f['preference'] >= -1000]
         if len(formats) > 1:
-            formats_s[0] += (' ' if self._format_note(formats[0]) else '') + '(worst)'
-            formats_s[-1] += (' ' if self._format_note(formats[-1]) else '') + '(best)'
+            table[-1][-1] += (' ' if table[-1][-1] else '') + '(best)'

-        header_line = line({
-            'format_id': 'format code', 'ext': 'extension',
-            'resolution': 'resolution', 'format_note': 'note'}, idlen=idlen)
-        self.to_screen('[info] Available formats for %s:\n%s\n%s' %
-                       (info_dict['id'], header_line, '\n'.join(formats_s)))
+        header_line = ['format code', 'extension', 'resolution', 'note']
+        self.to_screen(
+            '[info] Available formats for %s:\n%s' %
+            (info_dict['id'], render_table(header_line, table)))
+
+    def list_thumbnails(self, info_dict):
+        thumbnails = info_dict.get('thumbnails')
+        if not thumbnails:
+            tn_url = info_dict.get('thumbnail')
+            if tn_url:
+                thumbnails = [{'id': '0', 'url': tn_url}]
+            else:
+                self.to_screen(
+                    '[info] No thumbnails present for %s' % info_dict['id'])
+                return
+
+        self.to_screen(
+            '[info] Thumbnails for %s:' % info_dict['id'])
+        self.to_screen(render_table(
+            ['ID', 'width', 'height', 'URL'],
+            [[t['id'], t.get('width', 'unknown'), t.get('height', 'unknown'), t['url']] for t in thumbnails]))

     def urlopen(self, req):
         """ Start an HTTP download """
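The rewritten list_formats delegates column layout to render_table from youtube_dl.utils (the same helper the new list_thumbnails uses). A rough sketch of the intended behaviour, with illustrative rows and spacing that is approximate rather than exact:

    from youtube_dl.utils import render_table

    header = ['format code', 'extension', 'resolution', 'note']
    rows = [
        ['137', 'mp4', '1920x1080', 'DASH video'],
        ['140', 'm4a', 'audio only', 'DASH audio'],
    ]
    print(render_table(header, rows))
    # format code  extension  resolution  note
    # 137          mp4        1920x1080   DASH video
    # 140          m4a        audio only  DASH audio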
@@ -1329,7 +1599,7 @@ class YoutubeDL(object):
         # urllib chokes on URLs with non-ASCII characters (see http://bugs.python.org/issue3991)
         # To work around aforementioned issue we will replace request's original URL with
         # percent-encoded one
-        req_is_string = isinstance(req, basestring if sys.version_info < (3, 0) else compat_str)
+        req_is_string = isinstance(req, compat_basestring)
         url = req if req_is_string else req.get_full_url()
         url_escaped = escape_url(url)

@@ -1381,7 +1651,7 @@ class YoutubeDL(object):
         self._write_string('[debug] Python version %s - %s\n' % (
             platform.python_version(), platform_name()))

-        exe_versions = FFmpegPostProcessor.get_versions()
+        exe_versions = FFmpegPostProcessor.get_versions(self)
         exe_versions['rtmpdump'] = rtmpdump_version()
         exe_str = ', '.join(
             '%s %s' % (exe, v)
@@ -1398,6 +1668,17 @@ class YoutubeDL(object):
             proxy_map.update(handler.proxies)
         self._write_string('[debug] Proxy map: ' + compat_str(proxy_map) + '\n')

+        if self.params.get('call_home', False):
+            ipaddr = self.urlopen('https://yt-dl.org/ip').read().decode('utf-8')
+            self._write_string('[debug] Public IP address: %s\n' % ipaddr)
+            latest_version = self.urlopen(
+                'https://yt-dl.org/latest/version').read().decode('utf-8')
+            if version_tuple(latest_version) > version_tuple(__version__):
+                self.report_warning(
+                    'You are using an outdated version (newest version: %s)! '
+                    'See https://yt-dl.org/update if you need help updating.' %
+                    latest_version)
+
     def _setup_opener(self):
         timeout_val = self.params.get('socket_timeout')
         self._socket_timeout = 600 if timeout_val is None else float(timeout_val)
@@ -1428,9 +1709,8 @@ class YoutubeDL(object):
         proxy_handler = compat_urllib_request.ProxyHandler(proxies)

         debuglevel = 1 if self.params.get('debug_printtraffic') else 0
-        https_handler = make_HTTPS_handler(
-            self.params.get('nocheckcertificate', False), debuglevel=debuglevel)
-        ydlh = YoutubeDLHandler(debuglevel=debuglevel)
+        https_handler = make_HTTPS_handler(self.params, debuglevel=debuglevel)
+        ydlh = YoutubeDLHandler(self.params, debuglevel=debuglevel)
         opener = compat_urllib_request.build_opener(
             https_handler, proxy_handler, cookie_processor, ydlh)
         # Delete the default user-agent header, which would otherwise apply in
@@ -1454,3 +1734,39 @@ class YoutubeDL(object):
         if encoding is None:
             encoding = preferredencoding()
         return encoding
+
+    def _write_thumbnails(self, info_dict, filename):
+        if self.params.get('writethumbnail', False):
+            thumbnails = info_dict.get('thumbnails')
+            if thumbnails:
+                thumbnails = [thumbnails[-1]]
+        elif self.params.get('write_all_thumbnails', False):
+            thumbnails = info_dict.get('thumbnails')
+        else:
+            return
+
+        if not thumbnails:
+            # No thumbnails present, so return immediately
+            return
+
+        for t in thumbnails:
+            thumb_ext = determine_ext(t['url'], 'jpg')
+            suffix = '_%s' % t['id'] if len(thumbnails) > 1 else ''
+            thumb_display_id = '%s ' % t['id'] if len(thumbnails) > 1 else ''
+            thumb_filename = os.path.splitext(filename)[0] + suffix + '.' + thumb_ext
+
+            if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(thumb_filename)):
+                self.to_screen('[%s] %s: Thumbnail %sis already present' %
+                               (info_dict['extractor'], info_dict['id'], thumb_display_id))
+            else:
+                self.to_screen('[%s] %s: Downloading thumbnail %s...' %
+                               (info_dict['extractor'], info_dict['id'], thumb_display_id))
+                try:
+                    uf = self.urlopen(t['url'])
+                    with open(thumb_filename, 'wb') as thumbf:
+                        shutil.copyfileobj(uf, thumbf)
+                    self.to_screen('[%s] %s: Writing thumbnail %sto: %s' %
+                                   (info_dict['extractor'], info_dict['id'], thumb_display_id, thumb_filename))
+                except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
+                    self.report_warning('Unable to download thumbnail "%s": %s' %
+                                        (t['url'], compat_str(err)))
@@ -23,9 +23,10 @@ from .compat import (
 )
 from .utils import (
     DateRange,
-    DEFAULT_OUTTMPL,
     decodeOption,
+    DEFAULT_OUTTMPL,
     DownloadError,
+    match_filter_func,
     MaxDownloadsReached,
     preferredencoding,
     read_batch_urls,
@ -38,19 +39,8 @@ from .update import update_self
|
|||||||
from .downloader import (
|
from .downloader import (
|
||||||
FileDownloader,
|
FileDownloader,
|
||||||
)
|
)
|
||||||
from .extractor import gen_extractors
|
from .extractor import gen_extractors, list_extractors
|
||||||
from .YoutubeDL import YoutubeDL
|
from .YoutubeDL import YoutubeDL
|
||||||
from .postprocessor import (
|
|
||||||
AtomicParsleyPP,
|
|
||||||
FFmpegAudioFixPP,
|
|
||||||
FFmpegMetadataPP,
|
|
||||||
FFmpegVideoConvertor,
|
|
||||||
FFmpegExtractAudioPP,
|
|
||||||
FFmpegEmbedSubtitlePP,
|
|
||||||
XAttrMetadataPP,
|
|
||||||
FFmpegJoinVideosPP,
|
|
||||||
ExecAfterDownloadPP,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def _real_main(argv=None):
|
def _real_main(argv=None):
|
||||||
@ -106,24 +96,22 @@ def _real_main(argv=None):
|
|||||||
_enc = preferredencoding()
|
_enc = preferredencoding()
|
||||||
all_urls = [url.decode(_enc, 'ignore') if isinstance(url, bytes) else url for url in all_urls]
|
all_urls = [url.decode(_enc, 'ignore') if isinstance(url, bytes) else url for url in all_urls]
|
||||||
|
|
||||||
extractors = gen_extractors()
|
|
||||||
|
|
||||||
if opts.list_extractors:
|
if opts.list_extractors:
|
||||||
for ie in sorted(extractors, key=lambda ie: ie.IE_NAME.lower()):
|
for ie in list_extractors(opts.age_limit):
|
||||||
compat_print(ie.IE_NAME + (' (CURRENTLY BROKEN)' if not ie._WORKING else ''))
|
compat_print(ie.IE_NAME + (' (CURRENTLY BROKEN)' if not ie._WORKING else ''))
|
||||||
matchedUrls = [url for url in all_urls if ie.suitable(url)]
|
matchedUrls = [url for url in all_urls if ie.suitable(url)]
|
||||||
for mu in matchedUrls:
|
for mu in matchedUrls:
|
||||||
compat_print(' ' + mu)
|
compat_print(' ' + mu)
|
||||||
sys.exit(0)
|
sys.exit(0)
|
||||||
if opts.list_extractor_descriptions:
|
if opts.list_extractor_descriptions:
|
||||||
for ie in sorted(extractors, key=lambda ie: ie.IE_NAME.lower()):
|
for ie in list_extractors(opts.age_limit):
|
||||||
if not ie._WORKING:
|
if not ie._WORKING:
|
||||||
continue
|
continue
|
||||||
desc = getattr(ie, 'IE_DESC', ie.IE_NAME)
|
desc = getattr(ie, 'IE_DESC', ie.IE_NAME)
|
||||||
if desc is False:
|
if desc is False:
|
||||||
continue
|
continue
|
||||||
if hasattr(ie, 'SEARCH_KEY'):
|
if hasattr(ie, 'SEARCH_KEY'):
|
||||||
_SEARCHES = ('cute kittens', 'slithering pythons', 'falling cat', 'angry poodle', 'purple fish', 'running tortoise', 'sleeping bunny')
|
_SEARCHES = ('cute kittens', 'slithering pythons', 'falling cat', 'angry poodle', 'purple fish', 'running tortoise', 'sleeping bunny', 'burping cow')
|
||||||
_COUNTS = ('', '5', '10', 'all')
|
_COUNTS = ('', '5', '10', 'all')
|
||||||
desc += ' (Example: "%s%s:%s" )' % (ie.SEARCH_KEY, random.choice(_COUNTS), random.choice(_SEARCHES))
|
desc += ' (Example: "%s%s:%s" )' % (ie.SEARCH_KEY, random.choice(_COUNTS), random.choice(_SEARCHES))
|
||||||
compat_print(desc)
|
compat_print(desc)
|
||||||
@ -156,10 +144,13 @@ def _real_main(argv=None):
|
|||||||
parser.error('invalid max_filesize specified')
|
parser.error('invalid max_filesize specified')
|
||||||
opts.max_filesize = numeric_limit
|
opts.max_filesize = numeric_limit
|
||||||
if opts.retries is not None:
|
if opts.retries is not None:
|
||||||
try:
|
if opts.retries in ('inf', 'infinite'):
|
||||||
opts.retries = int(opts.retries)
|
opts_retries = float('inf')
|
||||||
except (TypeError, ValueError):
|
else:
|
||||||
parser.error('invalid retry count specified')
|
try:
|
||||||
|
opts_retries = int(opts.retries)
|
||||||
|
except (TypeError, ValueError):
|
||||||
|
parser.error('invalid retry count specified')
|
||||||
if opts.buffersize is not None:
|
if opts.buffersize is not None:
|
||||||
numeric_buffersize = FileDownloader.parse_bytes(opts.buffersize)
|
numeric_buffersize = FileDownloader.parse_bytes(opts.buffersize)
|
||||||
if numeric_buffersize is None:
|
if numeric_buffersize is None:
|
||||||
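The new retries handling accepts 'inf'/'infinite' as well as integers. Restated as a standalone sketch (not code from this commit) to make the mapping explicit:

    def parse_retries(retries):
        # 'inf' and 'infinite' mean unlimited retries; anything else must
        # parse as an integer, mirroring the option handling above.
        if retries in ('inf', 'infinite'):
            return float('inf')
        return int(retries)

    parse_retries('inf')  # float('inf')
    parse_retries('10')   # 10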
@ -179,6 +170,7 @@ def _real_main(argv=None):
|
|||||||
if opts.recodevideo is not None:
|
if opts.recodevideo is not None:
|
||||||
if opts.recodevideo not in ['mp4', 'flv', 'webm', 'ogg', 'mkv']:
|
if opts.recodevideo not in ['mp4', 'flv', 'webm', 'ogg', 'mkv']:
|
||||||
parser.error('invalid video recode format specified')
|
parser.error('invalid video recode format specified')
|
||||||
|
|
||||||
if opts.date is not None:
|
if opts.date is not None:
|
||||||
date = DateRange.day(opts.date)
|
date = DateRange.day(opts.date)
|
||||||
else:
|
else:
|
||||||
@ -210,16 +202,63 @@ def _real_main(argv=None):
|
|||||||
' file! Use "{0}.%(ext)s" instead of "{0}" as the output'
|
' file! Use "{0}.%(ext)s" instead of "{0}" as the output'
|
||||||
' template'.format(outtmpl))
|
' template'.format(outtmpl))
|
||||||
|
|
||||||
any_printing = opts.geturl or opts.gettitle or opts.getid or opts.getthumbnail or opts.getdescription or opts.getfilename or opts.getformat or opts.getduration or opts.dumpjson or opts.dump_single_json
|
any_getting = opts.geturl or opts.gettitle or opts.getid or opts.getthumbnail or opts.getdescription or opts.getfilename or opts.getformat or opts.getduration or opts.dumpjson or opts.dump_single_json
|
||||||
|
any_printing = opts.print_json
|
||||||
download_archive_fn = compat_expanduser(opts.download_archive) if opts.download_archive is not None else opts.download_archive
|
download_archive_fn = compat_expanduser(opts.download_archive) if opts.download_archive is not None else opts.download_archive
|
||||||
|
|
||||||
|
# PostProcessors
|
||||||
|
postprocessors = []
|
||||||
|
# Add the metadata pp first, the other pps will copy it
|
||||||
|
if opts.addmetadata:
|
||||||
|
postprocessors.append({'key': 'FFmpegMetadata'})
|
||||||
|
if opts.extractaudio:
|
||||||
|
postprocessors.append({
|
||||||
|
'key': 'FFmpegExtractAudio',
|
||||||
|
'preferredcodec': opts.audioformat,
|
||||||
|
'preferredquality': opts.audioquality,
|
||||||
|
'nopostoverwrites': opts.nopostoverwrites,
|
||||||
|
})
|
||||||
|
if opts.recodevideo:
|
||||||
|
postprocessors.append({
|
||||||
|
'key': 'FFmpegVideoConvertor',
|
||||||
|
'preferedformat': opts.recodevideo,
|
||||||
|
})
|
||||||
|
if opts.embedsubtitles:
|
||||||
|
postprocessors.append({
|
||||||
|
'key': 'FFmpegEmbedSubtitle',
|
||||||
|
'subtitlesformat': opts.subtitlesformat,
|
||||||
|
})
|
||||||
|
if opts.xattrs:
|
||||||
|
postprocessors.append({'key': 'XAttrMetadata'})
|
||||||
|
if opts.embedthumbnail:
|
||||||
|
if not opts.addmetadata:
|
||||||
|
postprocessors.append({'key': 'FFmpegAudioFix'})
|
||||||
|
postprocessors.append({'key': 'AtomicParsley'})
|
||||||
|
# Please keep ExecAfterDownload towards the bottom as it allows the user to modify the final file in any way.
|
||||||
|
# So if the user is able to remove the file before your postprocessor runs it might cause a few problems.
|
||||||
|
if opts.exec_cmd:
|
||||||
|
postprocessors.append({
|
||||||
|
'key': 'ExecAfterDownload',
|
||||||
|
'verboseOutput': opts.verbose,
|
||||||
|
'exec_cmd': opts.exec_cmd,
|
||||||
|
})
|
||||||
|
if opts.xattr_set_filesize:
|
||||||
|
try:
|
||||||
|
import xattr
|
||||||
|
xattr # Confuse flake8
|
||||||
|
except ImportError:
|
||||||
|
parser.error('setting filesize xattr requested but python-xattr is not available')
|
||||||
|
match_filter = (
|
||||||
|
None if opts.match_filter is None
|
||||||
|
else match_filter_func(opts.match_filter))
|
||||||
|
|
||||||
ydl_opts = {
|
ydl_opts = {
|
||||||
'usenetrc': opts.usenetrc,
|
'usenetrc': opts.usenetrc,
|
||||||
'username': opts.username,
|
'username': opts.username,
|
||||||
'password': opts.password,
|
'password': opts.password,
|
||||||
'twofactor': opts.twofactor,
|
'twofactor': opts.twofactor,
|
||||||
'videopassword': opts.videopassword,
|
'videopassword': opts.videopassword,
|
||||||
'quiet': (opts.quiet or any_printing),
|
'quiet': (opts.quiet or any_getting or any_printing),
|
||||||
'no_warnings': opts.no_warnings,
|
'no_warnings': opts.no_warnings,
|
||||||
'forceurl': opts.geturl,
|
'forceurl': opts.geturl,
|
||||||
'forcetitle': opts.gettitle,
|
'forcetitle': opts.gettitle,
|
||||||
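With this change the CLI describes postprocessors declaratively as a list of dicts and YoutubeDL instantiates them itself, instead of the caller constructing PP objects. A minimal sketch of using the new 'postprocessors' option from the embedding API (the URL is a placeholder):

    from youtube_dl import YoutubeDL

    ydl_opts = {
        'postprocessors': [
            {'key': 'FFmpegMetadata'},
            {'key': 'FFmpegExtractAudio',
             'preferredcodec': 'mp3',
             'preferredquality': '192'},
        ],
    }
    with YoutubeDL(ydl_opts) as ydl:
        ydl.download(['https://example.com/watch?v=placeholder'])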
@ -229,9 +268,9 @@ def _real_main(argv=None):
|
|||||||
'forceduration': opts.getduration,
|
'forceduration': opts.getduration,
|
||||||
'forcefilename': opts.getfilename,
|
'forcefilename': opts.getfilename,
|
||||||
'forceformat': opts.getformat,
|
'forceformat': opts.getformat,
|
||||||
'forcejson': opts.dumpjson,
|
'forcejson': opts.dumpjson or opts.print_json,
|
||||||
'dump_single_json': opts.dump_single_json,
|
'dump_single_json': opts.dump_single_json,
|
||||||
'simulate': opts.simulate or any_printing,
|
'simulate': opts.simulate or any_getting,
|
||||||
'skip_download': opts.skip_download,
|
'skip_download': opts.skip_download,
|
||||||
'format': opts.format,
|
'format': opts.format,
|
||||||
'format_limit': opts.format_limit,
|
'format_limit': opts.format_limit,
|
||||||
@ -242,7 +281,7 @@ def _real_main(argv=None):
|
|||||||
'ignoreerrors': opts.ignoreerrors,
|
'ignoreerrors': opts.ignoreerrors,
|
||||||
'ratelimit': opts.ratelimit,
|
'ratelimit': opts.ratelimit,
|
||||||
'nooverwrites': opts.nooverwrites,
|
'nooverwrites': opts.nooverwrites,
|
||||||
'retries': opts.retries,
|
'retries': opts_retries,
|
||||||
'buffersize': opts.buffersize,
|
'buffersize': opts.buffersize,
|
||||||
'noresizebuffer': opts.noresizebuffer,
|
'noresizebuffer': opts.noresizebuffer,
|
||||||
'continuedl': opts.continue_dl,
|
'continuedl': opts.continue_dl,
|
||||||
@ -250,6 +289,7 @@ def _real_main(argv=None):
|
|||||||
'progress_with_newline': opts.progress_with_newline,
|
'progress_with_newline': opts.progress_with_newline,
|
||||||
'playliststart': opts.playliststart,
|
'playliststart': opts.playliststart,
|
||||||
'playlistend': opts.playlistend,
|
'playlistend': opts.playlistend,
|
||||||
|
'playlistreverse': opts.playlist_reverse,
|
||||||
'noplaylist': opts.noplaylist,
|
'noplaylist': opts.noplaylist,
|
||||||
'joinparts': opts.joinparts,
|
'joinparts': opts.joinparts,
|
||||||
'logtostderr': opts.outtmpl == '-',
|
'logtostderr': opts.outtmpl == '-',
|
||||||
@ -260,6 +300,7 @@ def _real_main(argv=None):
|
|||||||
'writeannotations': opts.writeannotations,
|
'writeannotations': opts.writeannotations,
|
||||||
'writeinfojson': opts.writeinfojson,
|
'writeinfojson': opts.writeinfojson,
|
||||||
'writethumbnail': opts.writethumbnail,
|
'writethumbnail': opts.writethumbnail,
|
||||||
|
'write_all_thumbnails': opts.write_all_thumbnails,
|
||||||
'writesubtitles': opts.writesubtitles,
|
'writesubtitles': opts.writesubtitles,
|
||||||
'writeautomaticsub': opts.writeautomaticsub,
|
'writeautomaticsub': opts.writeautomaticsub,
|
||||||
'allsubtitles': opts.allsubtitles,
|
'allsubtitles': opts.allsubtitles,
|
||||||
@ -298,34 +339,23 @@ def _real_main(argv=None):
|
|||||||
'encoding': opts.encoding,
|
'encoding': opts.encoding,
|
||||||
'exec_cmd': opts.exec_cmd,
|
'exec_cmd': opts.exec_cmd,
|
||||||
'extract_flat': opts.extract_flat,
|
'extract_flat': opts.extract_flat,
|
||||||
|
'merge_output_format': opts.merge_output_format,
|
||||||
|
'postprocessors': postprocessors,
|
||||||
|
'fixup': opts.fixup,
|
||||||
|
'source_address': opts.source_address,
|
||||||
|
'call_home': opts.call_home,
|
||||||
|
'sleep_interval': opts.sleep_interval,
|
||||||
|
'external_downloader': opts.external_downloader,
|
||||||
|
'list_thumbnails': opts.list_thumbnails,
|
||||||
|
'playlist_items': opts.playlist_items,
|
||||||
|
'xattr_set_filesize': opts.xattr_set_filesize,
|
||||||
|
'match_filter': match_filter,
|
||||||
|
'no_color': opts.no_color,
|
||||||
|
'ffmpeg_location': opts.ffmpeg_location,
|
||||||
|
'hls_prefer_native': opts.hls_prefer_native,
|
||||||
}
|
}
|
||||||
|
|
||||||
with YoutubeDL(ydl_opts) as ydl:
|
with YoutubeDL(ydl_opts) as ydl:
|
||||||
# PostProcessors
|
|
||||||
if opts.joinparts:
|
|
||||||
ydl.add_post_processor(FFmpegJoinVideosPP())
|
|
||||||
# Add the metadata pp first, the other pps will copy it
|
|
||||||
if opts.addmetadata:
|
|
||||||
ydl.add_post_processor(FFmpegMetadataPP())
|
|
||||||
if opts.extractaudio:
|
|
||||||
ydl.add_post_processor(FFmpegExtractAudioPP(preferredcodec=opts.audioformat, preferredquality=opts.audioquality, nopostoverwrites=opts.nopostoverwrites))
|
|
||||||
if opts.recodevideo:
|
|
||||||
ydl.add_post_processor(FFmpegVideoConvertor(preferedformat=opts.recodevideo))
|
|
||||||
if opts.embedsubtitles:
|
|
||||||
ydl.add_post_processor(FFmpegEmbedSubtitlePP(subtitlesformat=opts.subtitlesformat))
|
|
||||||
if opts.xattrs:
|
|
||||||
ydl.add_post_processor(XAttrMetadataPP())
|
|
||||||
if opts.embedthumbnail:
|
|
||||||
if not opts.addmetadata:
|
|
||||||
ydl.add_post_processor(FFmpegAudioFixPP())
|
|
||||||
ydl.add_post_processor(AtomicParsleyPP())
|
|
||||||
|
|
||||||
# Please keep ExecAfterDownload towards the bottom as it allows the user to modify the final file in any way.
|
|
||||||
# So if the user is able to remove the file before your postprocessor runs it might cause a few problems.
|
|
||||||
if opts.exec_cmd:
|
|
||||||
ydl.add_post_processor(ExecAfterDownloadPP(
|
|
||||||
verboseOutput=opts.verbose, exec_cmd=opts.exec_cmd))
|
|
||||||
|
|
||||||
# Update version
|
# Update version
|
||||||
if opts.update_self:
|
if opts.update_self:
|
||||||
update_self(ydl.to_screen, opts.verbose)
|
update_self(ydl.to_screen, opts.verbose)
|
||||||
@ -340,7 +370,9 @@ def _real_main(argv=None):
|
|||||||
sys.exit()
|
sys.exit()
|
||||||
|
|
||||||
ydl.warn_if_short_id(sys.argv[1:] if argv is None else argv)
|
ydl.warn_if_short_id(sys.argv[1:] if argv is None else argv)
|
||||||
parser.error('you must provide at least one URL')
|
parser.error(
|
||||||
|
'You must provide at least one URL.\n'
|
||||||
|
'Type youtube-dl --help to see a list of all options.')
|
||||||
|
|
||||||
try:
|
try:
|
||||||
if opts.load_info_filename is not None:
|
if opts.load_info_filename is not None:
|
||||||
@ -363,3 +395,5 @@ def main(argv=None):
|
|||||||
sys.exit('ERROR: fixed output name but more than one file to download')
|
sys.exit('ERROR: fixed output name but more than one file to download')
|
||||||
except KeyboardInterrupt:
|
except KeyboardInterrupt:
|
||||||
sys.exit('\nERROR: Interrupted by user')
|
sys.exit('\nERROR: Interrupted by user')
|
||||||
|
|
||||||
|
__all__ = ['main', 'YoutubeDL', 'gen_extractors', 'list_extractors']
|
||||||
@@ -1,7 +1,5 @@
 from __future__ import unicode_literals

-__all__ = ['aes_encrypt', 'key_expansion', 'aes_ctr_decrypt', 'aes_cbc_decrypt', 'aes_decrypt_text']
-
 import base64
 from math import ceil

@@ -329,3 +327,5 @@ def inc(data):
             data[i] = data[i] + 1
             break
     return data
+
+__all__ = ['aes_encrypt', 'key_expansion', 'aes_ctr_decrypt', 'aes_cbc_decrypt', 'aes_decrypt_text']
@@ -4,6 +4,7 @@ import getpass
 import optparse
 import os
 import re
+import socket
 import subprocess
 import sys

@@ -70,6 +71,11 @@ try:
 except ImportError:
     compat_subprocess_get_DEVNULL = lambda: open(os.path.devnull, 'w')

+try:
+    import http.server as compat_http_server
+except ImportError:
+    import BaseHTTPServer as compat_http_server
+
 try:
     from urllib.parse import unquote as compat_urllib_parse_unquote
 except ImportError:
@ -108,6 +114,26 @@ except ImportError:
|
|||||||
string += pct_sequence.decode(encoding, errors)
|
string += pct_sequence.decode(encoding, errors)
|
||||||
return string
|
return string
|
||||||
|
|
||||||
|
try:
|
||||||
|
compat_str = unicode # Python 2
|
||||||
|
except NameError:
|
||||||
|
compat_str = str
|
||||||
|
|
||||||
|
try:
|
||||||
|
compat_basestring = basestring # Python 2
|
||||||
|
except NameError:
|
||||||
|
compat_basestring = str
|
||||||
|
|
||||||
|
try:
|
||||||
|
compat_chr = unichr # Python 2
|
||||||
|
except NameError:
|
||||||
|
compat_chr = chr
|
||||||
|
|
||||||
|
try:
|
||||||
|
from xml.etree.ElementTree import ParseError as compat_xml_parse_error
|
||||||
|
except ImportError: # Python 2.6
|
||||||
|
from xml.parsers.expat import ExpatError as compat_xml_parse_error
|
||||||
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
from urllib.parse import parse_qs as compat_parse_qs
|
from urllib.parse import parse_qs as compat_parse_qs
|
||||||
@ -117,7 +143,7 @@ except ImportError: # Python 2
|
|||||||
|
|
||||||
def _parse_qsl(qs, keep_blank_values=False, strict_parsing=False,
|
def _parse_qsl(qs, keep_blank_values=False, strict_parsing=False,
|
||||||
encoding='utf-8', errors='replace'):
|
encoding='utf-8', errors='replace'):
|
||||||
qs, _coerce_result = qs, unicode
|
qs, _coerce_result = qs, compat_str
|
||||||
pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
|
pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
|
||||||
r = []
|
r = []
|
||||||
for name_value in pairs:
|
for name_value in pairs:
|
||||||
@ -156,21 +182,6 @@ except ImportError: # Python 2
|
|||||||
parsed_result[name] = [value]
|
parsed_result[name] = [value]
|
||||||
return parsed_result
|
return parsed_result
|
||||||
|
|
||||||
try:
|
|
||||||
compat_str = unicode # Python 2
|
|
||||||
except NameError:
|
|
||||||
compat_str = str
|
|
||||||
|
|
||||||
try:
|
|
||||||
compat_chr = unichr # Python 2
|
|
||||||
except NameError:
|
|
||||||
compat_chr = chr
|
|
||||||
|
|
||||||
try:
|
|
||||||
from xml.etree.ElementTree import ParseError as compat_xml_parse_error
|
|
||||||
except ImportError: # Python 2.6
|
|
||||||
from xml.parsers.expat import ExpatError as compat_xml_parse_error
|
|
||||||
|
|
||||||
try:
|
try:
|
||||||
from shlex import quote as shlex_quote
|
from shlex import quote as shlex_quote
|
||||||
except ImportError: # Python < 3.3
|
except ImportError: # Python < 3.3
|
||||||
@ -247,7 +258,7 @@ else:
|
|||||||
userhome = compat_getenv('HOME')
|
userhome = compat_getenv('HOME')
|
||||||
elif 'USERPROFILE' in os.environ:
|
elif 'USERPROFILE' in os.environ:
|
||||||
userhome = compat_getenv('USERPROFILE')
|
userhome = compat_getenv('USERPROFILE')
|
||||||
elif not 'HOMEPATH' in os.environ:
|
elif 'HOMEPATH' not in os.environ:
|
||||||
return path
|
return path
|
||||||
else:
|
else:
|
||||||
try:
|
try:
|
||||||
@ -297,7 +308,9 @@ else:
|
|||||||
|
|
||||||
# Old 2.6 and 2.7 releases require kwargs to be bytes
|
# Old 2.6 and 2.7 releases require kwargs to be bytes
|
||||||
try:
|
try:
|
||||||
(lambda x: x)(**{'x': 0})
|
def _testfunc(x):
|
||||||
|
pass
|
||||||
|
_testfunc(**{'x': 0})
|
||||||
except TypeError:
|
except TypeError:
|
||||||
def compat_kwargs(kwargs):
|
def compat_kwargs(kwargs):
|
||||||
return dict((bytes(k), v) for k, v in kwargs.items())
|
return dict((bytes(k), v) for k, v in kwargs.items())
|
||||||
@ -305,6 +318,32 @@ else:
|
|||||||
compat_kwargs = lambda kwargs: kwargs
|
compat_kwargs = lambda kwargs: kwargs
|
||||||
|
|
||||||
|
|
||||||
|
if sys.version_info < (2, 7):
|
||||||
|
def compat_socket_create_connection(address, timeout, source_address=None):
|
||||||
|
host, port = address
|
||||||
|
err = None
|
||||||
|
for res in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM):
|
||||||
|
af, socktype, proto, canonname, sa = res
|
||||||
|
sock = None
|
||||||
|
try:
|
||||||
|
sock = socket.socket(af, socktype, proto)
|
||||||
|
sock.settimeout(timeout)
|
||||||
|
if source_address:
|
||||||
|
sock.bind(source_address)
|
||||||
|
sock.connect(sa)
|
||||||
|
return sock
|
||||||
|
except socket.error as _:
|
||||||
|
err = _
|
||||||
|
if sock is not None:
|
||||||
|
sock.close()
|
||||||
|
if err is not None:
|
||||||
|
raise err
|
||||||
|
else:
|
||||||
|
raise socket.error("getaddrinfo returns an empty list")
|
||||||
|
else:
|
||||||
|
compat_socket_create_connection = socket.create_connection
|
||||||
|
|
||||||
|
|
||||||
# Fix https://github.com/rg3/youtube-dl/issues/4223
|
# Fix https://github.com/rg3/youtube-dl/issues/4223
|
||||||
# See http://bugs.python.org/issue9161 for what is broken
|
# See http://bugs.python.org/issue9161 for what is broken
|
||||||
def workaround_optparse_bug9161():
|
def workaround_optparse_bug9161():
|
||||||
@ -328,6 +367,7 @@ def workaround_optparse_bug9161():
|
|||||||
|
|
||||||
__all__ = [
|
__all__ = [
|
||||||
'compat_HTTPError',
|
'compat_HTTPError',
|
||||||
|
'compat_basestring',
|
||||||
'compat_chr',
|
'compat_chr',
|
||||||
'compat_cookiejar',
|
'compat_cookiejar',
|
||||||
'compat_expanduser',
|
'compat_expanduser',
|
||||||
@ -336,10 +376,12 @@ __all__ = [
|
|||||||
'compat_html_entities',
|
'compat_html_entities',
|
||||||
'compat_html_parser',
|
'compat_html_parser',
|
||||||
'compat_http_client',
|
'compat_http_client',
|
||||||
|
'compat_http_server',
|
||||||
'compat_kwargs',
|
'compat_kwargs',
|
||||||
'compat_ord',
|
'compat_ord',
|
||||||
'compat_parse_qs',
|
'compat_parse_qs',
|
||||||
'compat_print',
|
'compat_print',
|
||||||
|
'compat_socket_create_connection',
|
||||||
'compat_str',
|
'compat_str',
|
||||||
'compat_subprocess_get_DEVNULL',
|
'compat_subprocess_get_DEVNULL',
|
||||||
'compat_urllib_error',
|
'compat_urllib_error',
|
||||||
@@ -1,35 +1,44 @@
 from __future__ import unicode_literals

 from .common import FileDownloader
+from .external import get_external_downloader
+from .f4m import F4mFD
 from .hls import HlsFD
 from .hls import NativeHlsFD
 from .http import HttpFD
 from .mplayer import MplayerFD
 from .rtmp import RtmpFD
-from .f4m import F4mFD

 from ..utils import (
-    determine_ext,
+    determine_protocol,
 )

+PROTOCOL_MAP = {
+    'rtmp': RtmpFD,
+    'm3u8_native': NativeHlsFD,
+    'm3u8': HlsFD,
+    'mms': MplayerFD,
+    'rtsp': MplayerFD,
+    'f4m': F4mFD,
+}
+

-def get_suitable_downloader(info_dict):
+def get_suitable_downloader(info_dict, params={}):
     """Get the downloader class that can handle the info dict."""
-    url = info_dict['url']
-    protocol = info_dict.get('protocol')
+    protocol = determine_protocol(info_dict)
+    info_dict['protocol'] = protocol

-    if url.startswith('rtmp'):
-        return RtmpFD
-    if protocol == 'm3u8_native':
+    external_downloader = params.get('external_downloader')
+    if external_downloader is not None:
+        ed = get_external_downloader(external_downloader)
+        if ed.supports(info_dict):
+            return ed
+
+    if protocol == 'm3u8' and params.get('hls_prefer_native'):
         return NativeHlsFD
-    if (protocol == 'm3u8') or (protocol is None and determine_ext(url) == 'm3u8'):
-        return HlsFD
-    if url.startswith('mms') or url.startswith('rtsp'):
-        return MplayerFD
-    if determine_ext(url) == 'f4m':
-        return F4mFD
-    else:
-        return HttpFD
+    return PROTOCOL_MAP.get(protocol, HttpFD)


 __all__ = [
     'get_suitable_downloader',
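Downloader selection is now a table lookup on the protocol rather than a chain of URL checks. A sketch of the resulting behaviour (URLs are placeholders):

    from youtube_dl.downloader import get_suitable_downloader

    # Protocol is derived from the info dict, then looked up in PROTOCOL_MAP,
    # falling back to HttpFD for plain http(s).
    get_suitable_downloader({'url': 'rtmp://example.com/live'})    # RtmpFD
    get_suitable_downloader({'url': 'http://example.com/v.mp4'})   # HttpFD

    # An external downloader wins if it supports the derived protocol:
    get_suitable_downloader(
        {'url': 'http://example.com/v.mp4'},
        params={'external_downloader': 'aria2c'})                  # Aria2cFD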
@@ -1,12 +1,12 @@
-from __future__ import unicode_literals
+from __future__ import division, unicode_literals

 import os
 import re
 import sys
 import time

+from ..compat import compat_str
 from ..utils import (
-    compat_str,
     encodeFilename,
     format_bytes,
     timeconvert,
@@ -25,21 +25,23 @@ class FileDownloader(object):

     Available options:

     verbose:            Print additional info to stdout.
     quiet:              Do not print messages to stdout.
     ratelimit:          Download speed limit, in bytes/sec.
     retries:            Number of times to retry for HTTP error 5xx
     buffersize:         Size of download buffer in bytes.
     noresizebuffer:     Do not automatically resize the download buffer.
     continuedl:         Try to continue downloads if possible.
     noprogress:         Do not print the progress bar.
     logtostderr:        Log messages to stderr instead of stdout.
     consoletitle:       Display progress in console window's titlebar.
     nopart:             Do not use temporary .part files.
     updatetime:         Use the Last-modified header to set output file timestamps.
     test:               Download only first bytes to test the downloader.
     min_filesize:       Skip files smaller than this size
     max_filesize:       Skip files larger than this size
+    xattr_set_filesize: Set ytdl.filesize user xattribute with expected size.
+                        (experimental)

     Subclasses of this one must re-define the real_download method.
     """
@ -52,6 +54,7 @@ class FileDownloader(object):
|
|||||||
self.ydl = ydl
|
self.ydl = ydl
|
||||||
self._progress_hooks = []
|
self._progress_hooks = []
|
||||||
self.params = params
|
self.params = params
|
||||||
|
self.add_progress_hook(self.report_progress)
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def format_seconds(seconds):
|
def format_seconds(seconds):
|
||||||
@ -80,6 +83,8 @@ class FileDownloader(object):
|
|||||||
def calc_eta(start, now, total, current):
|
def calc_eta(start, now, total, current):
|
||||||
if total is None:
|
if total is None:
|
||||||
return None
|
return None
|
||||||
|
if now is None:
|
||||||
|
now = time.time()
|
||||||
dif = now - start
|
dif = now - start
|
||||||
if current == 0 or dif < 0.001: # One millisecond
|
if current == 0 or dif < 0.001: # One millisecond
|
||||||
return None
|
return None
|
||||||
@ -146,18 +151,19 @@ class FileDownloader(object):
|
|||||||
def report_error(self, *args, **kargs):
|
def report_error(self, *args, **kargs):
|
||||||
self.ydl.report_error(*args, **kargs)
|
self.ydl.report_error(*args, **kargs)
|
||||||
|
|
||||||
def slow_down(self, start_time, byte_counter):
|
def slow_down(self, start_time, now, byte_counter):
|
||||||
"""Sleep if the download speed is over the rate limit."""
|
"""Sleep if the download speed is over the rate limit."""
|
||||||
rate_limit = self.params.get('ratelimit', None)
|
rate_limit = self.params.get('ratelimit', None)
|
||||||
if rate_limit is None or byte_counter == 0:
|
if rate_limit is None or byte_counter == 0:
|
||||||
return
|
return
|
||||||
now = time.time()
|
if now is None:
|
||||||
|
now = time.time()
|
||||||
elapsed = now - start_time
|
elapsed = now - start_time
|
||||||
if elapsed <= 0.0:
|
if elapsed <= 0.0:
|
||||||
return
|
return
|
||||||
speed = float(byte_counter) / elapsed
|
speed = float(byte_counter) / elapsed
|
||||||
if speed > rate_limit:
|
if speed > rate_limit:
|
||||||
time.sleep((byte_counter - rate_limit * (now - start_time)) / rate_limit)
|
time.sleep(max((byte_counter // rate_limit) - elapsed, 0))
|
||||||
|
|
||||||
def temp_name(self, filename):
|
def temp_name(self, filename):
|
||||||
"""Returns a temporary filename for the given filename."""
|
"""Returns a temporary filename for the given filename."""
|
||||||
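The new sleep computation caps the wait at the time still owed to the rate limit and can never go negative. Worked numbers (assumed purely for illustration):

    rate_limit = 1000        # bytes/sec
    byte_counter = 5000      # bytes received since start_time
    elapsed = 3.0            # seconds since start_time
    # 5000 bytes at 1000 B/s should take 5000 // 1000 = 5 s; we are 3 s in,
    # so the code sleeps max(5 - 3.0, 0) = 2.0 s.
    sleep_for = max((byte_counter // rate_limit) - elapsed, 0)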
@ -221,42 +227,64 @@ class FileDownloader(object):
|
|||||||
self.to_screen(clear_line + fullmsg, skip_eol=not is_last_line)
|
self.to_screen(clear_line + fullmsg, skip_eol=not is_last_line)
|
||||||
self.to_console_title('youtube-dl ' + msg)
|
self.to_console_title('youtube-dl ' + msg)
|
||||||
|
|
||||||
def report_progress(self, percent, data_len_str, speed, eta):
|
def report_progress(self, s):
|
||||||
"""Report download progress."""
|
if s['status'] == 'finished':
|
||||||
if self.params.get('noprogress', False):
|
if self.params.get('noprogress', False):
|
||||||
|
self.to_screen('[download] Download completed')
|
||||||
|
else:
|
||||||
|
s['_total_bytes_str'] = format_bytes(s['total_bytes'])
|
||||||
|
if s.get('elapsed') is not None:
|
||||||
|
s['_elapsed_str'] = self.format_seconds(s['elapsed'])
|
||||||
|
msg_template = '100%% of %(_total_bytes_str)s in %(_elapsed_str)s'
|
||||||
|
else:
|
||||||
|
msg_template = '100%% of %(_total_bytes_str)s'
|
||||||
|
self._report_progress_status(
|
||||||
|
msg_template % s, is_last_line=True)
|
||||||
|
|
||||||
|
if self.params.get('noprogress'):
|
||||||
return
|
return
|
||||||
if eta is not None:
|
|
||||||
eta_str = self.format_eta(eta)
|
|
||||||
else:
|
|
||||||
eta_str = 'Unknown ETA'
|
|
||||||
if percent is not None:
|
|
||||||
percent_str = self.format_percent(percent)
|
|
||||||
else:
|
|
||||||
percent_str = 'Unknown %'
|
|
||||||
speed_str = self.format_speed(speed)
|
|
||||||
|
|
||||||
msg = ('%s of %s at %s ETA %s' %
|
if s['status'] != 'downloading':
|
||||||
(percent_str, data_len_str, speed_str, eta_str))
|
|
||||||
self._report_progress_status(msg)
|
|
||||||
|
|
||||||
def report_progress_live_stream(self, downloaded_data_len, speed, elapsed):
|
|
||||||
if self.params.get('noprogress', False):
|
|
||||||
return
|
return
|
||||||
downloaded_str = format_bytes(downloaded_data_len)
|
|
||||||
speed_str = self.format_speed(speed)
|
|
||||||
elapsed_str = FileDownloader.format_seconds(elapsed)
|
|
||||||
msg = '%s at %s (%s)' % (downloaded_str, speed_str, elapsed_str)
|
|
||||||
self._report_progress_status(msg)
|
|
||||||
|
|
||||||
def report_finish(self, data_len_str, tot_time):
|
if s.get('eta') is not None:
|
||||||
"""Report download finished."""
|
s['_eta_str'] = self.format_eta(s['eta'])
|
||||||
if self.params.get('noprogress', False):
|
|
||||||
self.to_screen('[download] Download completed')
|
|
||||||
else:
|
else:
|
||||||
self._report_progress_status(
|
s['_eta_str'] = 'Unknown ETA'
|
||||||
('100%% of %s in %s' %
|
|
||||||
(data_len_str, self.format_seconds(tot_time))),
|
if s.get('total_bytes') and s.get('downloaded_bytes') is not None:
|
||||||
is_last_line=True)
|
s['_percent_str'] = self.format_percent(100 * s['downloaded_bytes'] / s['total_bytes'])
|
||||||
|
elif s.get('total_bytes_estimate') and s.get('downloaded_bytes') is not None:
|
||||||
|
s['_percent_str'] = self.format_percent(100 * s['downloaded_bytes'] / s['total_bytes_estimate'])
|
||||||
|
else:
|
||||||
|
if s.get('downloaded_bytes') == 0:
|
||||||
|
s['_percent_str'] = self.format_percent(0)
|
||||||
|
else:
|
||||||
|
s['_percent_str'] = 'Unknown %'
|
||||||
|
|
||||||
|
if s.get('speed') is not None:
|
||||||
|
s['_speed_str'] = self.format_speed(s['speed'])
|
||||||
|
else:
|
||||||
|
s['_speed_str'] = 'Unknown speed'
|
||||||
|
|
||||||
|
if s.get('total_bytes') is not None:
|
||||||
|
s['_total_bytes_str'] = format_bytes(s['total_bytes'])
|
||||||
|
msg_template = '%(_percent_str)s of %(_total_bytes_str)s at %(_speed_str)s ETA %(_eta_str)s'
|
||||||
|
elif s.get('total_bytes_estimate') is not None:
|
||||||
|
s['_total_bytes_estimate_str'] = format_bytes(s['total_bytes_estimate'])
|
||||||
|
msg_template = '%(_percent_str)s of ~%(_total_bytes_estimate_str)s at %(_speed_str)s ETA %(_eta_str)s'
|
||||||
|
else:
|
||||||
|
if s.get('downloaded_bytes') is not None:
|
||||||
|
s['_downloaded_bytes_str'] = format_bytes(s['downloaded_bytes'])
|
||||||
|
if s.get('elapsed'):
|
||||||
|
s['_elapsed_str'] = self.format_seconds(s['elapsed'])
|
||||||
|
msg_template = '%(_downloaded_bytes_str)s at %(_speed_str)s (%(_elapsed_str)s)'
|
||||||
|
else:
|
||||||
|
msg_template = '%(_downloaded_bytes_str)s at %(_speed_str)s'
|
||||||
|
else:
|
||||||
|
msg_template = '%(_percent_str)s % at %(_speed_str)s ETA %(_eta_str)s'
|
||||||
|
|
||||||
|
self._report_progress_status(msg_template % s)
|
||||||
|
|
||||||
def report_resuming_byte(self, resume_len):
|
def report_resuming_byte(self, resume_len):
|
||||||
"""Report attempt to resume at given byte."""
|
"""Report attempt to resume at given byte."""
|
||||||
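report_progress is now itself a progress hook operating on the status dict, registered in __init__ alongside any user hooks. A sketch of a user-supplied hook consuming the same dict (field availability depends on the downloader, hence the .get() calls):

    def my_hook(s):
        if s['status'] == 'downloading':
            print(s.get('downloaded_bytes'), 'of', s.get('total_bytes'))
        elif s['status'] == 'finished':
            print('Done:', s['filename'])

    ydl_opts = {'progress_hooks': [my_hook]}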
@ -281,8 +309,20 @@ class FileDownloader(object):
|
|||||||
"""Download to a filename using the info from info_dict
|
"""Download to a filename using the info from info_dict
|
||||||
Return True on success and False otherwise
|
Return True on success and False otherwise
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
nooverwrites_and_exists = (
|
||||||
|
self.params.get('nooverwrites', False)
|
||||||
|
and os.path.exists(encodeFilename(filename))
|
||||||
|
)
|
||||||
|
|
||||||
|
continuedl_and_exists = (
|
||||||
|
self.params.get('continuedl', False)
|
||||||
|
and os.path.isfile(encodeFilename(filename))
|
||||||
|
and not self.params.get('nopart', False)
|
||||||
|
)
|
||||||
|
|
||||||
# Check file already present
|
# Check file already present
|
||||||
if self.params.get('continuedl', False) and os.path.isfile(encodeFilename(filename)) and not self.params.get('nopart', False):
|
if filename != '-' and nooverwrites_and_exists or continuedl_and_exists:
|
||||||
self.report_file_already_downloaded(filename)
|
self.report_file_already_downloaded(filename)
|
||||||
self._hook_progress({
|
self._hook_progress({
|
||||||
'filename': filename,
|
'filename': filename,
|
||||||
@ -291,6 +331,11 @@ class FileDownloader(object):
|
|||||||
})
|
})
|
||||||
return True
|
return True
|
||||||
|
|
||||||
|
sleep_interval = self.params.get('sleep_interval')
|
||||||
|
if sleep_interval:
|
||||||
|
self.to_screen('[download] Sleeping %s seconds...' % sleep_interval)
|
||||||
|
time.sleep(sleep_interval)
|
||||||
|
|
||||||
return self.real_download(filename, info_dict)
|
return self.real_download(filename, info_dict)
|
||||||
|
|
||||||
def real_download(self, filename, info_dict):
|
def real_download(self, filename, info_dict):
|
||||||
@ -302,19 +347,27 @@ class FileDownloader(object):
|
|||||||
ph(status)
|
ph(status)
|
||||||
|
|
||||||
def add_progress_hook(self, ph):
|
def add_progress_hook(self, ph):
|
||||||
""" ph gets called on download progress, with a dictionary with the entries
|
# See YoutubeDl.py (search for progress_hooks) for a description of
|
||||||
* filename: The final filename
|
# this interface
|
||||||
* status: One of "downloading" and "finished"
|
|
||||||
|
|
||||||
It can also have some of the following entries:
|
|
||||||
|
|
||||||
* downloaded_bytes: Bytes on disks
|
|
||||||
* total_bytes: Total bytes, None if unknown
|
|
||||||
* tmpfilename: The filename we're currently writing to
|
|
||||||
* eta: The estimated time in seconds, None if unknown
|
|
||||||
* speed: The download speed in bytes/second, None if unknown
|
|
||||||
|
|
||||||
Hooks are guaranteed to be called at least once (with status "finished")
|
|
||||||
if the download is successful.
|
|
||||||
"""
|
|
||||||
self._progress_hooks.append(ph)
|
self._progress_hooks.append(ph)
|
||||||
|
|
||||||
|
def _debug_cmd(self, args, subprocess_encoding, exe=None):
|
||||||
|
if not self.params.get('verbose', False):
|
||||||
|
return
|
||||||
|
|
||||||
|
if exe is None:
|
||||||
|
exe = os.path.basename(args[0])
|
||||||
|
|
||||||
|
if subprocess_encoding:
|
||||||
|
str_args = [
|
||||||
|
a.decode(subprocess_encoding) if isinstance(a, bytes) else a
|
||||||
|
for a in args]
|
||||||
|
else:
|
||||||
|
str_args = args
|
||||||
|
try:
|
||||||
|
import pipes
|
||||||
|
shell_quote = lambda args: ' '.join(map(pipes.quote, str_args))
|
||||||
|
except ImportError:
|
||||||
|
shell_quote = repr
|
||||||
|
self.to_screen('[debug] %s command line: %s' % (
|
||||||
|
exe, shell_quote(str_args)))
|
||||||
126 youtube_dl/downloader/external.py Normal file
@@ -0,0 +1,126 @@
+from __future__ import unicode_literals
+
+import os.path
+import subprocess
+import sys
+
+from .common import FileDownloader
+from ..utils import (
+    encodeFilename,
+)
+
+
+class ExternalFD(FileDownloader):
+    def real_download(self, filename, info_dict):
+        self.report_destination(filename)
+        tmpfilename = self.temp_name(filename)
+
+        retval = self._call_downloader(tmpfilename, info_dict)
+        if retval == 0:
+            fsize = os.path.getsize(encodeFilename(tmpfilename))
+            self.to_screen('\r[%s] Downloaded %s bytes' % (self.get_basename(), fsize))
+            self.try_rename(tmpfilename, filename)
+            self._hook_progress({
+                'downloaded_bytes': fsize,
+                'total_bytes': fsize,
+                'filename': filename,
+                'status': 'finished',
+            })
+            return True
+        else:
+            self.to_stderr('\n')
+            self.report_error('%s exited with code %d' % (
+                self.get_basename(), retval))
+            return False
+
+    @classmethod
+    def get_basename(cls):
+        return cls.__name__[:-2].lower()
+
+    @property
+    def exe(self):
+        return self.params.get('external_downloader')
+
+    @classmethod
+    def supports(cls, info_dict):
+        return info_dict['protocol'] in ('http', 'https', 'ftp', 'ftps')
+
+    def _source_address(self, command_option):
+        source_address = self.params.get('source_address')
+        if source_address is None:
+            return []
+        return [command_option, source_address]
+
+    def _call_downloader(self, tmpfilename, info_dict):
+        """ Either overwrite this or implement _make_cmd """
+        cmd = self._make_cmd(tmpfilename, info_dict)
+
+        if sys.platform == 'win32' and sys.version_info < (3, 0):
+            # Windows subprocess module does not actually support Unicode
+            # on Python 2.x
+            # See http://stackoverflow.com/a/9951851/35070
+            subprocess_encoding = sys.getfilesystemencoding()
+            cmd = [a.encode(subprocess_encoding, 'ignore') for a in cmd]
+        else:
+            subprocess_encoding = None
+        self._debug_cmd(cmd, subprocess_encoding)
+
+        p = subprocess.Popen(
+            cmd, stderr=subprocess.PIPE)
+        _, stderr = p.communicate()
+        if p.returncode != 0:
+            self.to_stderr(stderr)
+        return p.returncode
+
+
+class CurlFD(ExternalFD):
+    def _make_cmd(self, tmpfilename, info_dict):
+        cmd = [self.exe, '--location', '-o', tmpfilename]
+        for key, val in info_dict['http_headers'].items():
+            cmd += ['--header', '%s: %s' % (key, val)]
+        cmd += self._source_address('--interface')
+        cmd += ['--', info_dict['url']]
+        return cmd
+
+
+class WgetFD(ExternalFD):
+    def _make_cmd(self, tmpfilename, info_dict):
+        cmd = [self.exe, '-O', tmpfilename, '-nv', '--no-cookies']
+        for key, val in info_dict['http_headers'].items():
+            cmd += ['--header', '%s: %s' % (key, val)]
+        cmd += self._source_address('--bind-address')
+        cmd += ['--', info_dict['url']]
+        return cmd
+
+
+class Aria2cFD(ExternalFD):
+    def _make_cmd(self, tmpfilename, info_dict):
+        cmd = [
+            self.exe, '-c',
+            '--min-split-size', '1M', '--max-connection-per-server', '4']
+        dn = os.path.dirname(tmpfilename)
+        if dn:
+            cmd += ['--dir', dn]
+        cmd += ['--out', os.path.basename(tmpfilename)]
+        for key, val in info_dict['http_headers'].items():
+            cmd += ['--header', '%s: %s' % (key, val)]
+        cmd += self._source_address('--interface')
+        cmd += ['--', info_dict['url']]
+        return cmd
+
+_BY_NAME = dict(
+    (klass.get_basename(), klass)
+    for name, klass in globals().items()
+    if name.endswith('FD') and name != 'ExternalFD'
+)
+
+
+def list_external_downloaders():
+    return sorted(_BY_NAME.keys())
+
+
+def get_external_downloader(external_downloader):
+    """ Given the name of the executable, see whether we support the given
+        downloader . """
+    bn = os.path.basename(external_downloader)
+    return _BY_NAME[bn]
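The _BY_NAME registry maps each concrete class name, minus the 'FD' suffix and lowercased, to the class itself, so supporting a new external tool is just defining another ExternalFD subclass. Usage sketch:

    from youtube_dl.downloader.external import (
        get_external_downloader, list_external_downloaders)

    list_external_downloaders()                 # ['aria2c', 'curl', 'wget']
    fd = get_external_downloader('/usr/local/bin/aria2c')  # basename -> Aria2cFD
    fd.supports({'protocol': 'https'})          # True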
|
@ -1,4 +1,4 @@
|
|||||||
from __future__ import unicode_literals
|
from __future__ import division, unicode_literals
|
||||||
|
|
||||||
import base64
|
import base64
|
||||||
import io
|
import io
|
||||||
@ -9,11 +9,12 @@ import xml.etree.ElementTree as etree
|
|||||||
|
|
||||||
from .common import FileDownloader
|
from .common import FileDownloader
|
||||||
from .http import HttpFD
|
from .http import HttpFD
|
||||||
|
from ..compat import (
|
||||||
|
compat_urlparse,
|
||||||
|
)
|
||||||
from ..utils import (
|
from ..utils import (
|
||||||
struct_pack,
|
struct_pack,
|
||||||
struct_unpack,
|
struct_unpack,
|
||||||
compat_urlparse,
|
|
||||||
format_bytes,
|
|
||||||
encodeFilename,
|
encodeFilename,
|
||||||
sanitize_open,
|
sanitize_open,
|
||||||
xpath_text,
|
xpath_text,
|
||||||
@@ -175,34 +176,43 @@ def build_fragments_list(boot_info):
     """ Return a list of (segment, fragment) for each fragment in the video """
     res = []
     segment_run_table = boot_info['segments'][0]
-    # I've only found videos with one segment
-    segment_run_entry = segment_run_table['segment_run'][0]
-    n_frags = segment_run_entry[1]
     fragment_run_entry_table = boot_info['fragments'][0]['fragments']
     first_frag_number = fragment_run_entry_table[0]['first']
-    for (i, frag_number) in zip(range(1, n_frags + 1), itertools.count(first_frag_number)):
-        res.append((1, frag_number))
+    fragments_counter = itertools.count(first_frag_number)
+    for segment, fragments_count in segment_run_table['segment_run']:
+        for _ in range(fragments_count):
+            res.append((segment, next(fragments_counter)))
     return res
 
 
-def write_flv_header(stream, metadata):
-    """Writes the FLV header and the metadata to stream"""
+def write_unsigned_int(stream, val):
+    stream.write(struct_pack('!I', val))
+
+
+def write_unsigned_int_24(stream, val):
+    stream.write(struct_pack('!I', val)[1:])
+
+
+def write_flv_header(stream):
+    """Writes the FLV header to stream"""
     # FLV header
     stream.write(b'FLV\x01')
     stream.write(b'\x05')
     stream.write(b'\x00\x00\x00\x09')
-    # FLV File body
     stream.write(b'\x00\x00\x00\x00')
-    # FLVTAG
-    # Script data
-    stream.write(b'\x12')
-    # Size of the metadata with 3 bytes
-    stream.write(struct_pack('!L', len(metadata))[1:])
-    stream.write(b'\x00\x00\x00\x00\x00\x00\x00')
-    stream.write(metadata)
-    # Magic numbers extracted from the output files produced by AdobeHDS.php
-    #(https://github.com/K-S-V/Scripts)
-    stream.write(b'\x00\x00\x01\x73')
+
+
+def write_metadata_tag(stream, metadata):
+    """Writes optional metadata tag to stream"""
+    SCRIPT_TAG = b'\x12'
+    FLV_TAG_HEADER_LEN = 11
+
+    if metadata:
+        stream.write(SCRIPT_TAG)
+        write_unsigned_int_24(stream, len(metadata))
+        stream.write(b'\x00\x00\x00\x00\x00\x00\x00')
+        stream.write(metadata)
+        write_unsigned_int(stream, FLV_TAG_HEADER_LEN + len(metadata))
 
 
 def _add_ns(prop):
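To see what the reworked `build_fragments_list` produces, here is a hand-made `boot_info` (not taken from a real bootstrap box, shape assumed from the code above) and the resulting list:

```python
boot_info = {
    'segments': [{'segment_run': [(1, 2), (2, 3)]}],  # (segment, fragment count)
    'fragments': [{'fragments': [{'first': 1}]}],
}
# build_fragments_list(boot_info) ->
# [(1, 1), (1, 2), (2, 3), (2, 4), (2, 5)]
```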
@@ -219,24 +229,32 @@ class F4mFD(FileDownloader):
     A downloader for f4m manifests or AdobeHDS.
     """
 
+    def _get_unencrypted_media(self, doc):
+        media = doc.findall(_add_ns('media'))
+        if not media:
+            self.report_error('No media found')
+        for e in (doc.findall(_add_ns('drmAdditionalHeader')) +
+                  doc.findall(_add_ns('drmAdditionalHeaderSet'))):
+            # If id attribute is missing it's valid for all media nodes
+            # without drmAdditionalHeaderId or drmAdditionalHeaderSetId attribute
+            if 'id' not in e.attrib:
+                self.report_error('Missing ID in f4m DRM')
+        media = list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib and
+                                      'drmAdditionalHeaderSetId' not in e.attrib,
+                            media))
+        if not media:
+            self.report_error('Unsupported DRM')
+        return media
+
     def real_download(self, filename, info_dict):
         man_url = info_dict['url']
         requested_bitrate = info_dict.get('tbr')
         self.to_screen('[download] Downloading f4m manifest')
         manifest = self.ydl.urlopen(man_url).read()
-        self.report_destination(filename)
-        http_dl = HttpQuietDownloader(
-            self.ydl,
-            {
-                'continuedl': True,
-                'quiet': True,
-                'noprogress': True,
-                'test': self.params.get('test', False),
-            }
-        )
 
         doc = etree.fromstring(manifest)
-        formats = [(int(f.attrib.get('bitrate', -1)), f) for f in doc.findall(_add_ns('media'))]
+        formats = [(int(f.attrib.get('bitrate', -1)), f)
+                   for f in self._get_unencrypted_media(doc)]
         if requested_bitrate is None:
             # get the best format
             formats = sorted(formats, key=lambda f: f[0])
@@ -253,7 +271,11 @@ class F4mFD(FileDownloader):
             bootstrap = self.ydl.urlopen(bootstrap_url).read()
         else:
             bootstrap = base64.b64decode(bootstrap_node.text)
-        metadata = base64.b64decode(media.find(_add_ns('metadata')).text)
+        metadata_node = media.find(_add_ns('metadata'))
+        if metadata_node is not None:
+            metadata = base64.b64decode(metadata_node.text)
+        else:
+            metadata = None
         boot_info = read_bootstrap_info(bootstrap)
 
         fragments_list = build_fragments_list(boot_info)
@@ -264,38 +286,65 @@ class F4mFD(FileDownloader):
         # For some akamai manifests we'll need to add a query to the fragment url
         akamai_pv = xpath_text(doc, _add_ns('pv-2.0'))
 
+        self.report_destination(filename)
+        http_dl = HttpQuietDownloader(
+            self.ydl,
+            {
+                'continuedl': True,
+                'quiet': True,
+                'noprogress': True,
+                'ratelimit': self.params.get('ratelimit', None),
+                'test': self.params.get('test', False),
+            }
+        )
         tmpfilename = self.temp_name(filename)
         (dest_stream, tmpfilename) = sanitize_open(tmpfilename, 'wb')
-        write_flv_header(dest_stream, metadata)
+
+        write_flv_header(dest_stream)
+        write_metadata_tag(dest_stream, metadata)
+
         # This dict stores the download progress, it's updated by the progress
         # hook
         state = {
+            'status': 'downloading',
             'downloaded_bytes': 0,
-            'frag_counter': 0,
+            'frag_index': 0,
+            'frag_count': total_frags,
+            'filename': filename,
+            'tmpfilename': tmpfilename,
         }
         start = time.time()
 
-        def frag_progress_hook(status):
-            frag_total_bytes = status.get('total_bytes', 0)
-            estimated_size = (state['downloaded_bytes'] +
-                (total_frags - state['frag_counter']) * frag_total_bytes)
-            if status['status'] == 'finished':
+        def frag_progress_hook(s):
+            if s['status'] not in ('downloading', 'finished'):
+                return
+
+            frag_total_bytes = s.get('total_bytes', 0)
+            if s['status'] == 'finished':
                 state['downloaded_bytes'] += frag_total_bytes
-                state['frag_counter'] += 1
-                progress = self.calc_percent(state['frag_counter'], total_frags)
-                byte_counter = state['downloaded_bytes']
+                state['frag_index'] += 1
+
+            estimated_size = (
+                (state['downloaded_bytes'] + frag_total_bytes)
+                / (state['frag_index'] + 1) * total_frags)
+            time_now = time.time()
+            state['total_bytes_estimate'] = estimated_size
+            state['elapsed'] = time_now - start
+
+            if s['status'] == 'finished':
+                progress = self.calc_percent(state['frag_index'], total_frags)
             else:
-                frag_downloaded_bytes = status['downloaded_bytes']
-                byte_counter = state['downloaded_bytes'] + frag_downloaded_bytes
+                frag_downloaded_bytes = s['downloaded_bytes']
                 frag_progress = self.calc_percent(frag_downloaded_bytes,
                                                   frag_total_bytes)
-                progress = self.calc_percent(state['frag_counter'], total_frags)
+                progress = self.calc_percent(state['frag_index'], total_frags)
                 progress += frag_progress / float(total_frags)
 
-            eta = self.calc_eta(start, time.time(), estimated_size, byte_counter)
-            self.report_progress(progress, format_bytes(estimated_size),
-                status.get('speed'), eta)
+                state['eta'] = self.calc_eta(
+                    start, time_now, estimated_size, state['downloaded_bytes'] + frag_downloaded_bytes)
+                state['speed'] = s.get('speed')
+            self._hook_progress(state)
 
         http_dl.add_progress_hook(frag_progress_hook)
 
         frags_filenames = []
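The new size estimate is a running average of bytes per fragment scaled to the fragment count; with illustrative numbers:

```python
# After 3 finished fragments totalling 3.0 MiB, with a 1.0 MiB fragment reported:
# estimated_size = (3.0 + 1.0) / (3 + 1) * total_frags  # avg MiB/fragment * count
# For total_frags == 40 that yields an estimate of about 40 MiB.
```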
@@ -319,8 +368,8 @@ class F4mFD(FileDownloader):
             frags_filenames.append(frag_filename)
 
         dest_stream.close()
-        self.report_finish(format_bytes(state['downloaded_bytes']), time.time() - start)
 
+        elapsed = time.time() - start
         self.try_rename(tmpfilename, filename)
         for frag_file in frags_filenames:
             os.remove(frag_file)
@@ -331,6 +380,7 @@ class F4mFD(FileDownloader):
             'total_bytes': fsize,
             'filename': filename,
             'status': 'finished',
+            'elapsed': elapsed,
         })
 
         return True
youtube_dl/downloader/hls.py
@@ -4,11 +4,14 @@ import os
 import re
 import subprocess
 
+from ..postprocessor.ffmpeg import FFmpegPostProcessor
 from .common import FileDownloader
-from ..utils import (
+from ..compat import (
     compat_urlparse,
     compat_urllib_request,
-    check_executable,
+)
+from ..utils import (
+    encodeArgument,
     encodeFilename,
 )
 
@@ -19,23 +22,21 @@ class HlsFD(FileDownloader):
         self.report_destination(filename)
         tmpfilename = self.temp_name(filename)
 
-        args = [
-            '-y', '-i', url, '-f', 'mp4', '-c', 'copy',
-            '-bsf:a', 'aac_adtstoasc',
-            encodeFilename(tmpfilename, for_subprocess=True)]
-
-        for program in ['avconv', 'ffmpeg']:
-            if check_executable(program, ['-version']):
-                break
-        else:
+        ffpp = FFmpegPostProcessor(downloader=self)
+        if not ffpp.available:
             self.report_error('m3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
             return False
-        cmd = [program] + args
+        ffpp.check_version()
 
-        retval = subprocess.call(cmd)
+        args = [
+            encodeArgument(opt)
+            for opt in (ffpp.executable, '-y', '-i', url, '-f', 'mp4', '-c', 'copy', '-bsf:a', 'aac_adtstoasc')]
+        args.append(encodeFilename(tmpfilename, True))
+
+        retval = subprocess.call(args)
         if retval == 0:
             fsize = os.path.getsize(encodeFilename(tmpfilename))
-            self.to_screen('\r[%s] %s bytes' % (cmd[0], fsize))
+            self.to_screen('\r[%s] %s bytes' % (args[0], fsize))
             self.try_rename(tmpfilename, filename)
             self._hook_progress({
                 'downloaded_bytes': fsize,
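For a hypothetical URL and temporary filename, the `args` list assembled above comes out roughly as follows (illustrative only; the first element depends on `ffpp.executable`):

```python
# ['ffmpeg', '-y', '-i', 'https://example.com/index.m3u8',
#  '-f', 'mp4', '-c', 'copy', '-bsf:a', 'aac_adtstoasc', 'video.mp4.part']
```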
@@ -46,7 +47,7 @@ class HlsFD(FileDownloader):
             return True
         else:
             self.to_stderr('\n')
-            self.report_error('%s exited with code %d' % (program, retval))
+            self.report_error('%s exited with code %d' % (ffpp.basename, retval))
             return False
 
 
youtube_dl/downloader/http.py
@@ -1,17 +1,19 @@
 from __future__ import unicode_literals
 
+import errno
 import os
+import socket
 import time
 
 from .common import FileDownloader
-from ..utils import (
+from ..compat import (
     compat_urllib_request,
     compat_urllib_error,
+)
+from ..utils import (
     ContentTooShortError,
-
     encodeFilename,
     sanitize_open,
-    format_bytes,
 )
 
 
@@ -23,10 +25,6 @@ class HttpFD(FileDownloader):
 
         # Do not include the Accept-Encoding header
         headers = {'Youtubedl-no-compression': 'True'}
-        if 'user_agent' in info_dict:
-            headers['Youtubedl-user-agent'] = info_dict['user_agent']
-        if 'http_referer' in info_dict:
-            headers['Referer'] = info_dict['http_referer']
         add_headers = info_dict.get('http_headers')
         if add_headers:
             headers.update(add_headers)
@@ -102,6 +100,11 @@ class HttpFD(FileDownloader):
                         resume_len = 0
                         open_mode = 'wb'
                         break
+            except socket.error as e:
+                if e.errno != errno.ECONNRESET:
+                    # Connection reset is no problem, just retry
+                    raise
+
             # Retry
             count += 1
             if count <= retries:
@@ -132,20 +135,24 @@ class HttpFD(FileDownloader):
                 self.to_screen('\r[download] File is larger than max-filesize (%s bytes > %s bytes). Aborting.' % (data_len, max_data_len))
                 return False
 
-        data_len_str = format_bytes(data_len)
         byte_counter = 0 + resume_len
         block_size = self.params.get('buffersize', 1024)
         start = time.time()
+
+        # measure time over whole while-loop, so slow_down() and best_block_size() work together properly
+        now = None  # needed for slow_down() in the first loop run
+        before = start  # start measuring
        while True:
+
             # Download and write
-            before = time.time()
             data_block = data.read(block_size if not is_test else min(block_size, data_len - byte_counter))
-            after = time.time()
-            if len(data_block) == 0:
-                break
             byte_counter += len(data_block)
 
-            # Open file just in time
+            # exit loop when download is finished
+            if len(data_block) == 0:
+                break
+
+            # Open destination file just in time
             if stream is None:
                 try:
                     (stream, tmpfilename) = sanitize_open(tmpfilename, open_mode)
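The measurement now spans the whole loop body rather than just the `read()` call; schematically (names taken from the hunk above):

```python
# before -> read block -> write block -> slow_down() -> now = time.time(); after = now
# so best_block_size(after - before, len(data_block)) sees the full per-block cost,
# and slow_down(start, now, ...) gets a consistent timestamp on the next iteration.
```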
@@ -155,47 +162,68 @@ class HttpFD(FileDownloader):
                 except (OSError, IOError) as err:
                     self.report_error('unable to open for writing: %s' % str(err))
                     return False
+
+                if self.params.get('xattr_set_filesize', False) and data_len is not None:
+                    try:
+                        import xattr
+                        xattr.setxattr(tmpfilename, 'user.ytdl.filesize', str(data_len))
+                    except(OSError, IOError, ImportError) as err:
+                        self.report_error('unable to set filesize xattr: %s' % str(err))
+
             try:
                 stream.write(data_block)
             except (IOError, OSError) as err:
                 self.to_stderr('\n')
                 self.report_error('unable to write data: %s' % str(err))
                 return False
+
+            # Apply rate limit
+            self.slow_down(start, now, byte_counter - resume_len)
+
+            # end measuring of one loop run
+            now = time.time()
+            after = now
+
+            # Adjust block size
             if not self.params.get('noresizebuffer', False):
                 block_size = self.best_block_size(after - before, len(data_block))
 
+            before = after
+
             # Progress message
-            speed = self.calc_speed(start, time.time(), byte_counter - resume_len)
+            speed = self.calc_speed(start, now, byte_counter - resume_len)
             if data_len is None:
-                eta = percent = None
+                eta = None
             else:
-                percent = self.calc_percent(byte_counter, data_len)
                 eta = self.calc_eta(start, time.time(), data_len - resume_len, byte_counter - resume_len)
-            self.report_progress(percent, data_len_str, speed, eta)
 
             self._hook_progress({
+                'status': 'downloading',
                 'downloaded_bytes': byte_counter,
                 'total_bytes': data_len,
                 'tmpfilename': tmpfilename,
                 'filename': filename,
-                'status': 'downloading',
                 'eta': eta,
                 'speed': speed,
+                'elapsed': now - start,
             })
 
             if is_test and byte_counter == data_len:
                 break
 
-            # Apply rate limit
-            self.slow_down(start, byte_counter - resume_len)
-
         if stream is None:
             self.to_stderr('\n')
             self.report_error('Did not get any data blocks')
             return False
         if tmpfilename != '-':
             stream.close()
-        self.report_finish(data_len_str, (time.time() - start))
+
+        self._hook_progress({
+            'downloaded_bytes': byte_counter,
+            'total_bytes': data_len,
+            'tmpfilename': tmpfilename,
+            'status': 'error',
+        })
         if data_len is not None and byte_counter != data_len:
             raise ContentTooShortError(byte_counter, int(data_len))
         self.try_rename(tmpfilename, filename)
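The expected-filesize attribute written above can be read back with the same third-party `xattr` module (a hedged sketch: the filename is hypothetical, and the exact API depends on which `xattr` variant is installed):

```python
import xattr  # third-party module, same one the downloader imports
# returns e.g. b'123456789' if the expected size was 123456789 bytes
print(xattr.getxattr('video.mp4.part', 'user.ytdl.filesize'))
```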
@@ -209,6 +237,7 @@ class HttpFD(FileDownloader):
             'total_bytes': byte_counter,
             'filename': filename,
             'status': 'finished',
+            'elapsed': time.time() - start,
         })
 
         return True
youtube_dl/downloader/mplayer.py
@@ -4,8 +4,8 @@ import os
 import subprocess
 
 from .common import FileDownloader
-from ..compat import compat_subprocess_get_DEVNULL
 from ..utils import (
+    check_executable,
     encodeFilename,
 )
 
@@ -20,11 +20,7 @@ class MplayerFD(FileDownloader):
             'mplayer', '-really-quiet', '-vo', 'null', '-vc', 'dummy',
             '-dumpstream', '-dumpfile', tmpfilename, url]
         # Check for mplayer first
-        try:
-            subprocess.call(
-                ['mplayer', '-h'],
-                stdout=compat_subprocess_get_DEVNULL(), stderr=subprocess.STDOUT)
-        except (OSError, IOError):
+        if not check_executable('mplayer', ['-h']):
             self.report_error('MMS or RTSP download detected but "%s" could not be run' % args[0])
             return False
 
youtube_dl/downloader/rtmp.py
@@ -7,11 +7,10 @@ import sys
 import time
 
 from .common import FileDownloader
+from ..compat import compat_str
 from ..utils import (
     check_executable,
-    compat_str,
     encodeFilename,
-    format_bytes,
     get_exe_version,
 )
 
@@ -51,23 +50,23 @@ class RtmpFD(FileDownloader):
                     if not resume_percent:
                         resume_percent = percent
                         resume_downloaded_data_len = downloaded_data_len
-                    eta = self.calc_eta(start, time.time(), 100 - resume_percent, percent - resume_percent)
-                    speed = self.calc_speed(start, time.time(), downloaded_data_len - resume_downloaded_data_len)
+                    time_now = time.time()
+                    eta = self.calc_eta(start, time_now, 100 - resume_percent, percent - resume_percent)
+                    speed = self.calc_speed(start, time_now, downloaded_data_len - resume_downloaded_data_len)
                     data_len = None
                     if percent > 0:
                         data_len = int(downloaded_data_len * 100 / percent)
-                        data_len_str = '~' + format_bytes(data_len)
-                    self.report_progress(percent, data_len_str, speed, eta)
-                    cursor_in_new_line = False
                     self._hook_progress({
+                        'status': 'downloading',
                         'downloaded_bytes': downloaded_data_len,
-                        'total_bytes': data_len,
+                        'total_bytes_estimate': data_len,
                         'tmpfilename': tmpfilename,
                         'filename': filename,
-                        'status': 'downloading',
                         'eta': eta,
+                        'elapsed': time_now - start,
                         'speed': speed,
                     })
+                    cursor_in_new_line = False
                 else:
                     # no percent for live streams
                     mobj = re.search(r'([0-9]+\.[0-9]{3}) kB / [0-9]+\.[0-9]{2} sec', line)
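`data_len` here is extrapolated from rtmpdump's percentage output; with illustrative numbers:

```python
# downloaded_data_len == 26214400 (25 MiB) at percent == 50.0
# data_len = int(26214400 * 100 / 50.0)  # -> 52428800, ~50 MiB estimated total
```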
@@ -75,15 +74,15 @@ class RtmpFD(FileDownloader):
                         downloaded_data_len = int(float(mobj.group(1)) * 1024)
                         time_now = time.time()
                         speed = self.calc_speed(start, time_now, downloaded_data_len)
-                        self.report_progress_live_stream(downloaded_data_len, speed, time_now - start)
-                        cursor_in_new_line = False
                         self._hook_progress({
                             'downloaded_bytes': downloaded_data_len,
                             'tmpfilename': tmpfilename,
                             'filename': filename,
                             'status': 'downloading',
+                            'elapsed': time_now - start,
                             'speed': speed,
                         })
+                        cursor_in_new_line = False
                 elif self.params.get('verbose', False):
                     if not cursor_in_new_line:
                         self.to_screen('')
@@ -104,6 +103,9 @@ class RtmpFD(FileDownloader):
         live = info_dict.get('rtmp_live', False)
         conn = info_dict.get('rtmp_conn', None)
         protocol = info_dict.get('rtmp_protocol', None)
+        real_time = info_dict.get('rtmp_real_time', False)
+        no_resume = info_dict.get('no_resume', False)
+        continue_dl = info_dict.get('continuedl', False)
 
         self.report_destination(filename)
         tmpfilename = self.temp_name(filename)
@@ -141,7 +143,14 @@ class RtmpFD(FileDownloader):
             basic_args += ['--conn', conn]
         if protocol is not None:
             basic_args += ['--protocol', protocol]
-        args = basic_args + [[], ['--resume', '--skip', '1']][not live and self.params.get('continuedl', False)]
+        if real_time:
+            basic_args += ['--realtime']
+
+        args = basic_args
+        if not no_resume and continue_dl and not live:
+            args += ['--resume']
+        if not live and continue_dl:
+            args += ['--skip', '1']
 
         if sys.platform == 'win32' and sys.version_info < (3, 0):
             # Windows subprocess module does not actually support Unicode
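The replaced one-liner relied on indexing a two-element list with a boolean; the new form spells the cases out and lets `no_resume` suppress `--resume` on its own:

```python
# Old trick: a bool subscripts as 0 or 1.
[[], ['--resume', '--skip', '1']][False]  # -> []
[[], ['--resume', '--skip', '1']][True]   # -> ['--resume', '--skip', '1']
```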
@@ -152,19 +161,7 @@ class RtmpFD(FileDownloader):
         else:
             subprocess_encoding = None
 
-        if self.params.get('verbose', False):
-            if subprocess_encoding:
-                str_args = [
-                    a.decode(subprocess_encoding) if isinstance(a, bytes) else a
-                    for a in args]
-            else:
-                str_args = args
-            try:
-                import pipes
-                shell_quote = lambda args: ' '.join(map(pipes.quote, str_args))
-            except ImportError:
-                shell_quote = repr
-            self.to_screen('[debug] rtmpdump command line: ' + shell_quote(str_args))
+        self._debug_cmd(args, subprocess_encoding, exe='rtmpdump')
 
         RD_SUCCESS = 0
         RD_FAILED = 1
@@ -185,7 +182,7 @@ class RtmpFD(FileDownloader):
             cursize = os.path.getsize(encodeFilename(tmpfilename))
             if prevsize == cursize and retval == RD_FAILED:
                 break
             # Some rtmp streams seem abort after ~ 99.8%. Don't complain for those
             if prevsize == cursize and retval == RD_INCOMPLETE and cursize > 1024:
                 self.to_screen('[rtmpdump] Could not download the whole video. This can happen for some advertisements.')
                 retval = RD_SUCCESS
youtube_dl/extractor/__init__.py
@@ -1,10 +1,15 @@
 from __future__ import unicode_literals
 
 from .abc import ABCIE
+from .abc7news import Abc7NewsIE
 from .academicearth import AcademicEarthCourseIE
 from .addanime import AddAnimeIE
+from .adobetv import AdobeTVIE
 from .adultswim import AdultSwimIE
+from .aftenposten import AftenpostenIE
 from .aftonbladet import AftonbladetIE
+from .aljazeera import AlJazeeraIE
+from .alphaporno import AlphaPornoIE
 from .anitube import AnitubeIE
 from .anysex import AnySexIE
 from .aol import AolIE
@@ -22,13 +27,16 @@ from .arte import (
     ArteTVDDCIE,
     ArteTVEmbedIE,
 )
-from .audiomack import AudiomackIE
-from .auengine import AUEngineIE
+from .atresplayer import AtresPlayerIE
+from .atttechchannel import ATTTechChannelIE
+from .audiomack import AudiomackIE, AudiomackAlbumIE
+from .azubu import AzubuIE
 from .bambuser import BambuserIE, BambuserChannelIE
 from .bandcamp import BandcampIE, BandcampAlbumIE
 from .bbccouk import BBCCoUkIE
 from .beeg import BeegIE
 from .behindkink import BehindKinkIE
+from .bet import BetIE
 from .bild import BildIE
 from .bilibili import BiliBiliIE
 from .blinkx import BlinkxIE
@@ -41,15 +49,21 @@ from .brightcove import BrightcoveIE
 from .buzzfeed import BuzzFeedIE
 from .byutv import BYUtvIE
 from .c56 import C56IE
+from .camdemy import (
+    CamdemyIE,
+    CamdemyFolderIE
+)
 from .canal13cl import Canal13clIE
 from .canalplus import CanalplusIE
 from .canalc2 import Canalc2IE
 from .cbs import CBSIE
 from .cbsnews import CBSNewsIE
+from .cbssports import CBSSportsIE
+from .ccc import CCCIE
 from .ceskatelevize import CeskaTelevizeIE
 from .channel9 import Channel9IE
 from .chilloutzone import ChilloutzoneIE
-from .cinemassacre import CinemassacreIE
+from .cinchcast import CinchcastIE
 from .clipfish import ClipfishIE
 from .cliphunter import CliphunterIE
 from .clipsyndicate import ClipsyndicateIE
@@ -60,9 +74,13 @@ from .cnet import CNETIE
 from .cnn import (
     CNNIE,
     CNNBlogsIE,
+    CNNArticleIE,
 )
 from .collegehumor import CollegeHumorIE
+from .collegerama import CollegeRamaIE
 from .comedycentral import ComedyCentralIE, ComedyCentralShowsIE
+from .comcarcoff import ComCarCoffIE
+from .commonmistakes import CommonMistakesIE, UnicodeBOMIE
 from .condenast import CondeNastIE
 from .cracked import CrackedIE
 from .criterion import CriterionIE
@@ -71,6 +89,7 @@ from .crunchyroll import (
     CrunchyrollShowPlaylistIE
 )
 from .cspan import CSpanIE
+from .ctsnews import CtsNewsIE
 from .dailymotion import (
     DailymotionIE,
     DailymotionPlaylistIE,
@@ -78,18 +97,22 @@ from .dailymotion import (
 )
 from .daum import DaumIE
 from .dbtv import DBTVIE
+from .dctp import DctpTvIE
 from .deezer import DeezerPlaylistIE
 from .dfb import DFBIE
 from .dotsub import DotsubIE
 from .dreisat import DreiSatIE
+from .drbonanza import DRBonanzaIE
 from .drtuber import DrTuberIE
 from .drtv import DRTVIE
+from .dvtv import DVTVIE
 from .dump import DumpIE
 from .defense import DefenseGouvFrIE
 from .discovery import DiscoveryIE
 from .divxstage import DivxStageIE
 from .dropbox import DropboxIE
 from .ebaumsworld import EbaumsWorldIE
+from .echomsk import EchoMskIE
 from .ehow import EHowIE
 from .eighttracks import EightTracksIE
 from .einthusan import EinthusanIE
@@ -99,9 +122,11 @@ from .ellentv import (
     EllenTVClipsIE,
 )
 from .elpais import ElPaisIE
+from .embedly import EmbedlyIE
 from .empflix import EMPFlixIE
 from .engadget import EngadgetIE
 from .eporner import EpornerIE
+from .eroprofile import EroProfileIE
 from .escapist import EscapistIE
 from .everyonesmixtape import EveryonesMixtapeIE
 from .exfm import ExfmIE
@@ -121,6 +146,8 @@ from .fktv import (
 from .flickr import FlickrIE
 from .folketinget import FolketingetIE
 from .fourtube import FourTubeIE
+from .foxgay import FoxgayIE
+from .foxnews import FoxNewsIE
 from .franceculture import FranceCultureIE
 from .franceinter import FranceInterIE
 from .francetv import (
@@ -144,6 +171,8 @@ from .gamestar import GameStarIE
 from .gametrailers import GametrailersIE
 from .gdcvault import GDCVaultIE
 from .generic import GenericIE
+from .giantbomb import GiantBombIE
+from .giga import GigaIE
 from .glide import GlideIE
 from .globo import GloboIE
 from .godtube import GodTubeIE
@@ -154,10 +183,16 @@ from .googlesearch import GoogleSearchIE
 from .gorillavid import GorillaVidIE
 from .goshgay import GoshgayIE
 from .grooveshark import GroovesharkIE
+from .groupon import GrouponIE
 from .hark import HarkIE
+from .hearthisat import HearThisAtIE
 from .heise import HeiseIE
+from .hellporno import HellPornoIE
 from .helsinki import HelsinkiIE
 from .hentaistigma import HentaiStigmaIE
+from .historicfilms import HistoricFilmsIE
+from .history import HistoryIE
+from .hitbox import HitboxIE, HitboxLiveIE
 from .hornbunny import HornBunnyIE
 from .hostingbulk import HostingBulkIE
 from .hotnewhiphop import HotNewHipHopIE
@@ -171,6 +206,7 @@ from .imdb import (
     ImdbIE,
     ImdbListIE
 )
+from .imgur import ImgurIE
 from .ina import InaIE
 from .infoq import InfoQIE
 from .instagram import InstagramIE, InstagramUserIE
@@ -187,6 +223,7 @@ from .jove import JoveIE
 from .jukebox import JukeboxIE
 from .jpopsukitv import JpopsukiIE
 from .kankan import KankanIE
+from .karaoketv import KaraoketvIE
 from .keezmovies import KeezMoviesIE
 from .khanacademy import KhanAcademyIE
 from .kickstarter import KickStarterIE
@@ -203,6 +240,7 @@ from .livestream import (
     LivestreamOriginalIE,
     LivestreamShortenerIE,
 )
+from .lnkgo import LnkGoIE
 from .lrt import LRTIE
 from .lynda import (
     LyndaIE,
@@ -216,6 +254,7 @@ from .mdr import MDRIE
 from .metacafe import MetacafeIE
 from .metacritic import MetacriticIE
 from .mgoon import MgoonIE
+from .minhateca import MinhatecaIE
 from .ministrygrid import MinistryGridIE
 from .mit import TechTVMITIE, MITIE, OCWMITIE
 from .mitele import MiTeleIE
@@ -242,9 +281,11 @@ from .muenchentv import MuenchenTVIE
 from .musicplayon import MusicPlayOnIE
 from .musicvault import MusicVaultIE
 from .muzu import MuzuTVIE
-from .myspace import MySpaceIE
+from .myspace import MySpaceIE, MySpaceAlbumIE
 from .myspass import MySpassIE
 from .myvideo import MyVideoIE
+from .myvidster import MyVidsterIE
+from .nationalgeographic import NationalGeographicIE
 from .naver import NaverIE
 from .nba import NBAIE
 from .nbc import (
@@ -253,11 +294,24 @@ from .nbc import (
 )
 from .ndr import NDRIE
 from .ndtv import NDTVIE
+from .netzkino import NetzkinoIE
+from .nerdcubed import NerdCubedFeedIE
+from .nerdist import NerdistIE
 from .newgrounds import NewgroundsIE
 from .newstube import NewstubeIE
+from .nextmedia import (
+    NextMediaIE,
+    NextMediaActionNewsIE,
+    AppleDailyRealtimeNewsIE,
+    AppleDailyAnimationNewsIE
+)
 from .nfb import NFBIE
 from .nfl import NFLIE
-from .nhl import NHLIE, NHLVideocenterIE
+from .nhl import (
+    NHLIE,
+    NHLNewsIE,
+    NHLVideocenterIE,
+)
 from .niconico import NiconicoIE, NiconicoPlaylistIE
 from .ninegag import NineGagIE
 from .noco import NocoIE
@@ -268,17 +322,22 @@ from .nowness import NownessIE
 from .nowvideo import NowVideoIE
 from .npo import (
     NPOIE,
+    NPOLiveIE,
+    NPORadioIE,
+    NPORadioFragmentIE,
     TegenlichtVproIE,
 )
 from .nrk import (
     NRKIE,
     NRKTVIE,
 )
-from .ntv import NTVIE
+from .ntvde import NTVDeIE
+from .ntvru import NTVRuIE
 from .nytimes import NYTimesIE
 from .nuvid import NuvidIE
 from .oktoberfesttv import OktoberfestTVIE
 from .ooyala import OoyalaIE
+from .openfilm import OpenFilmIE
 from .orf import (
     ORFTVthekIE,
     ORFOE1IE,
@@ -295,40 +354,53 @@ from .playfm import PlayFMIE
 from .playvid import PlayvidIE
 from .podomatic import PodomaticIE
 from .pornhd import PornHdIE
-from .pornhub import PornHubIE
+from .pornhub import (
+    PornHubIE,
+    PornHubPlaylistIE,
+)
 from .pornotube import PornotubeIE
 from .pornoxo import PornoXOIE
 from .promptfile import PromptFileIE
 from .prosiebensat1 import ProSiebenSat1IE
 from .pyvideo import PyvideoIE
 from .quickvid import QuickVidIE
+from .radiode import RadioDeIE
+from .radiobremen import RadioBremenIE
 from .radiofrance import RadioFranceIE
 from .rai import RaiIE
 from .rbmaradio import RBMARadioIE
 from .redtube import RedTubeIE
+from .restudy import RestudyIE
 from .reverbnation import ReverbNationIE
 from .ringtv import RingTVIE
 from .ro220 import Ro220IE
 from .rottentomatoes import RottenTomatoesIE
 from .roxwel import RoxwelIE
 from .rtbf import RTBFIE
-from .rtlnl import RtlXlIE
+from .rte import RteIE
+from .rtlnl import RtlNlIE
 from .rtlnow import RTLnowIE
+from .rtl2 import RTL2IE
+from .rtp import RTPIE
 from .rts import RTSIE
 from .rtve import RTVEALaCartaIE, RTVELiveIE
 from .ruhd import RUHDIE
 from .rutube import (
     RutubeIE,
     RutubeChannelIE,
+    RutubeEmbedIE,
     RutubeMovieIE,
     RutubePersonIE,
 )
 from .rutv import RUTVIE
+from .sandia import SandiaIE
 from .sapo import SapoIE
 from .savefrom import SaveFromIE
 from .sbs import SBSIE
 from .scivee import SciVeeIE
 from .screencast import ScreencastIE
+from .screencastomatic import ScreencastOMaticIE
+from .screenwavemedia import CinemassacreIE, ScreenwaveMediaIE, TeamFourIE
 from .servingsys import ServingSysIE
 from .sexu import SexuIE
 from .sexykarma import SexyKarmaIE
@@ -370,7 +442,9 @@ from .stanfordoc import StanfordOpenClassroomIE
 from .steam import SteamIE
 from .streamcloud import StreamcloudIE
 from .streamcz import StreamCZIE
+from .streetvoice import StreetVoiceIE
 from .sunporno import SunPornoIE
+from .svtplay import SVTPlayIE
 from .swrmediathek import SWRMediathekIE
 from .syfy import SyfyIE
 from .sztvhu import SztvHuIE
@@ -388,8 +462,10 @@ from .ted import TEDIE
 from .telebruxelles import TeleBruxellesIE
 from .telecinco import TelecincoIE
 from .telemb import TeleMBIE
+from .teletask import TeleTaskIE
 from .tenplay import TenPlayIE
 from .testurl import TestURLIE
+from .testtube import TestTubeIE
 from .tf1 import TF1IE
 from .theonion import TheOnionIE
 from .theplatform import ThePlatformIE
@@ -414,10 +490,21 @@ from .tumblr import TumblrIE
 from .tunein import TuneInIE
 from .turbo import TurboIE
 from .tutv import TutvIE
+from .tv4 import TV4IE
 from .tvigle import TvigleIE
-from .tvp import TvpIE
+from .tvp import TvpIE, TvpSeriesIE
 from .tvplay import TVPlayIE
-from .twitch import TwitchIE
+from .tweakers import TweakersIE
+from .twentyfourvideo import TwentyFourVideoIE
+from .twitch import (
+    TwitchVideoIE,
+    TwitchChapterIE,
+    TwitchVodIE,
+    TwitchProfileIE,
+    TwitchPastBroadcastsIE,
+    TwitchBookmarksIE,
+    TwitchStreamIE,
+)
 from .ubu import UbuIE
 from .udemy import (
     UdemyIE,
@@ -445,6 +532,7 @@ from .videott import VideoTtIE
 from .videoweed import VideoWeedIE
 from .vidme import VidmeIE
 from .vidzi import VidziIE
+from .vier import VierIE, VierVideosIE
 from .vimeo import (
     VimeoIE,
     VimeoAlbumIE,
@@ -480,11 +568,13 @@ from .wdr import (
     WDRMobileIE,
     WDRMausIE,
 )
+from .webofstories import WebOfStoriesIE
 from .weibo import WeiboIE
 from .wimp import WimpIE
 from .wistia import WistiaIE
 from .worldstarhiphop import WorldStarHipHopIE
 from .wrzuta import WrzutaIE
+from .wsj import WSJIE
 from .xbef import XBefIE
 from .xboxclips import XboxClipsIE
 from .xhamster import XHamsterIE
@@ -492,10 +582,14 @@ from .xminus import XMinusIE
 from .xnxx import XNXXIE
 from .xvideos import XVideosIE
 from .xtube import XTubeUserIE, XTubeIE
+from .xuite import XuiteIE
+from .xxxymovies import XXXYMoviesIE
 from .yahoo import (
     YahooIE,
     YahooSearchIE,
 )
+from .yam import YamIE
+from .yesjapan import YesJapanIE
 from .ynet import YnetIE
 from .youjizz import YouJizzIE
 from .youku import YoukuIE
@@ -513,12 +607,12 @@ from .youtube import (
     YoutubeSearchURLIE,
     YoutubeShowIE,
     YoutubeSubscriptionsIE,
-    YoutubeTopListIE,
+    YoutubeTruncatedIDIE,
     YoutubeTruncatedURLIE,
     YoutubeUserIE,
     YoutubeWatchLaterIE,
 )
-from .zdf import ZDFIE
+from .zdf import ZDFIE, ZDFChannelIE
 from .zingmp3 import (
     ZingMp3SongIE,
     ZingMp3AlbumIE,
@@ -539,6 +633,17 @@ def gen_extractors():
     return [klass() for klass in _ALL_CLASSES]
 
 
+def list_extractors(age_limit):
+    """
+    Return a list of extractors that are suitable for the given age,
+    sorted by extractor ID.
+    """
+
+    return sorted(
+        filter(lambda ie: ie.is_suitable(age_limit), gen_extractors()),
+        key=lambda ie: ie.IE_NAME.lower())
+
+
 def get_info_extractor(ie_name):
     """Returns the info extractor class with the given ie_name"""
     return globals()[ie_name + 'IE']
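A minimal sketch of the new `list_extractors` helper (output illustrative):

```python
from youtube_dl.extractor import list_extractors

for ie in list_extractors(age_limit=0):
    print(ie.IE_NAME)  # extractor IDs, sorted case-insensitively
```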
68  youtube_dl/extractor/abc7news.py  Normal file
@@ -0,0 +1,68 @@
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import parse_iso8601
+
+
+class Abc7NewsIE(InfoExtractor):
+    _VALID_URL = r'https?://abc7news\.com(?:/[^/]+/(?P<display_id>[^/]+))?/(?P<id>\d+)'
+    _TESTS = [
+        {
+            'url': 'http://abc7news.com/entertainment/east-bay-museum-celebrates-vintage-synthesizers/472581/',
+            'info_dict': {
+                'id': '472581',
+                'display_id': 'east-bay-museum-celebrates-vintage-synthesizers',
+                'ext': 'mp4',
+                'title': 'East Bay museum celebrates history of synthesized music',
+                'description': 'md5:a4f10fb2f2a02565c1749d4adbab4b10',
+                'thumbnail': 're:^https?://.*\.jpg$',
+                'timestamp': 1421123075,
+                'upload_date': '20150113',
+                'uploader': 'Jonathan Bloom',
+            },
+            'params': {
+                # m3u8 download
+                'skip_download': True,
+            },
+        },
+        {
+            'url': 'http://abc7news.com/472581',
+            'only_matching': True,
+        },
+    ]
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+        display_id = mobj.group('display_id') or video_id
+
+        webpage = self._download_webpage(url, display_id)
+
+        m3u8 = self._html_search_meta(
+            'contentURL', webpage, 'm3u8 url', fatal=True)
+
+        formats = self._extract_m3u8_formats(m3u8, display_id, 'mp4')
+        self._sort_formats(formats)
+
+        title = self._og_search_title(webpage).strip()
+        description = self._og_search_description(webpage).strip()
+        thumbnail = self._og_search_thumbnail(webpage)
+        timestamp = parse_iso8601(self._search_regex(
+            r'<div class="meta">\s*<time class="timeago" datetime="([^"]+)">',
+            webpage, 'upload date', fatal=False))
+        uploader = self._search_regex(
+            r'rel="author">([^<]+)</a>',
+            webpage, 'uploader', default=None)
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
+            'uploader': uploader,
+            'formats': formats,
+        }
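`parse_iso8601` turns the `datetime` attribute matched above into a Unix timestamp; for example:

```python
from youtube_dl.utils import parse_iso8601

parse_iso8601('2015-01-13T00:00:00Z')  # -> 1421107200
```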
70
youtube_dl/extractor/adobetv.py
Normal file
70
youtube_dl/extractor/adobetv.py
Normal file
@ -0,0 +1,70 @@
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import (
    parse_duration,
    unified_strdate,
    str_to_int,
)


class AdobeTVIE(InfoExtractor):
    _VALID_URL = r'https?://tv\.adobe\.com/watch/[^/]+/(?P<id>[^/]+)'

    _TEST = {
        'url': 'http://tv.adobe.com/watch/the-complete-picture-with-julieanne-kost/quick-tip-how-to-draw-a-circle-around-an-object-in-photoshop/',
        'md5': '9bc5727bcdd55251f35ad311ca74fa1e',
        'info_dict': {
            'id': 'quick-tip-how-to-draw-a-circle-around-an-object-in-photoshop',
            'ext': 'mp4',
            'title': 'Quick Tip - How to Draw a Circle Around an Object in Photoshop',
            'description': 'md5:99ec318dc909d7ba2a1f2b038f7d2311',
            'thumbnail': 're:https?://.*\.jpg$',
            'upload_date': '20110914',
            'duration': 60,
            'view_count': int,
        },
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)

        webpage = self._download_webpage(url, video_id)

        player = self._parse_json(
            self._search_regex(r'html5player:\s*({.+?})\s*\n', webpage, 'player'),
            video_id)

        title = player.get('title') or self._search_regex(
            r'data-title="([^"]+)"', webpage, 'title')
        description = self._og_search_description(webpage)
        thumbnail = self._og_search_thumbnail(webpage)

        upload_date = unified_strdate(
            self._html_search_meta('datepublished', webpage, 'upload date'))

        duration = parse_duration(
            self._html_search_meta('duration', webpage, 'duration')
            or self._search_regex(r'Runtime:\s*(\d{2}:\d{2}:\d{2})', webpage, 'duration'))

        view_count = str_to_int(self._search_regex(
            r'<div class="views">\s*Views?:\s*([\d,.]+)\s*</div>',
            webpage, 'view count'))

        formats = [{
            'url': source['src'],
            'format_id': source.get('quality') or source['src'].split('-')[-1].split('.')[0] or None,
            'tbr': source.get('bitrate'),
        } for source in player['sources']]
        self._sort_formats(formats)

        return {
            'id': video_id,
            'title': title,
            'description': description,
            'thumbnail': thumbnail,
            'upload_date': upload_date,
            'duration': duration,
            'view_count': view_count,
            'formats': formats,
        }
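As a side note, the core scraping pattern in AdobeTVIE, pulling a JavaScript object literal out of the page with a regex and then parsing it as JSON, is easy to exercise in isolation. A minimal sketch (the page snippet and field values below are invented for illustration, not taken from tv.adobe.com):

import json
import re

webpage = '''
<script>
    var config = {
        html5player: {"title": "Demo clip", "sources": [{"src": "http://example.com/demo-1000.mp4", "bitrate": 1000}]}
</script>
'''

# Same idea as _search_regex + _parse_json in the extractor: capture the
# object literal after "html5player:" up to the end of its line.
player = json.loads(re.search(r'html5player:\s*({.+?})\s*\n', webpage).group(1))

for source in player['sources']:
    # Fall back to the "-1000" suffix of the filename when no explicit
    # quality is given, mirroring the format_id expression in AdobeTVIE.
    format_id = source.get('quality') or source['src'].split('-')[-1].split('.')[0]
    print(format_id, source['src'], source.get('bitrate'))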
@ -2,123 +2,152 @@
 from __future__ import unicode_literals
 
 import re
+import json
 
 from .common import InfoExtractor
+from ..utils import (
+    ExtractorError,
+    xpath_text,
+    float_or_none,
+)
 
 
 class AdultSwimIE(InfoExtractor):
-    _VALID_URL = r'https?://video\.adultswim\.com/(?P<path>.+?)(?:\.html)?(?:\?.*)?(?:#.*)?$'
-    _TEST = {
-        'url': 'http://video.adultswim.com/rick-and-morty/close-rick-counters-of-the-rick-kind.html?x=y#title',
+    _VALID_URL = r'https?://(?:www\.)?adultswim\.com/videos/(?P<is_playlist>playlists/)?(?P<show_path>[^/]+)/(?P<episode_path>[^/?#]+)/?'
+
+    _TESTS = [{
+        'url': 'http://adultswim.com/videos/rick-and-morty/pilot',
         'playlist': [
             {
-                'md5': '4da359ec73b58df4575cd01a610ba5dc',
+                'md5': '247572debc75c7652f253c8daa51a14d',
                 'info_dict': {
-                    'id': '8a250ba1450996e901453d7f02ca02f5',
+                    'id': 'rQxZvXQ4ROaSOqq-or2Mow-0',
                     'ext': 'flv',
-                    'title': 'Rick and Morty Close Rick-Counters of the Rick Kind part 1',
-                    'description': 'Rick has a run in with some old associates, resulting in a fallout with Morty. You got any chips, broh?',
-                    'uploader': 'Rick and Morty',
-                    'thumbnail': 'http://i.cdn.turner.com/asfix/repository/8a250ba13f865824013fc9db8b6b0400/thumbnail_267549017116827057.jpg'
-                }
+                    'title': 'Rick and Morty - Pilot Part 1',
+                    'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
+                },
             },
             {
-                'md5': 'ffbdf55af9331c509d95350bd0cc1819',
+                'md5': '77b0e037a4b20ec6b98671c4c379f48d',
                 'info_dict': {
-                    'id': '8a250ba1450996e901453d7f4bd102f6',
+                    'id': 'rQxZvXQ4ROaSOqq-or2Mow-3',
                     'ext': 'flv',
-                    'title': 'Rick and Morty Close Rick-Counters of the Rick Kind part 2',
-                    'description': 'Rick has a run in with some old associates, resulting in a fallout with Morty. You got any chips, broh?',
-                    'uploader': 'Rick and Morty',
-                    'thumbnail': 'http://i.cdn.turner.com/asfix/repository/8a250ba13f865824013fc9db8b6b0400/thumbnail_267549017116827057.jpg'
-                }
+                    'title': 'Rick and Morty - Pilot Part 4',
+                    'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
+                },
             },
+        ],
+        'info_dict': {
+            'id': 'rQxZvXQ4ROaSOqq-or2Mow',
+            'title': 'Rick and Morty - Pilot',
+            'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
+        }
+    }, {
+        'url': 'http://www.adultswim.com/videos/playlists/american-parenting/putting-francine-out-of-business/',
+        'playlist': [
             {
-                'md5': 'b92409635540304280b4b6c36bd14a0a',
+                'md5': '2eb5c06d0f9a1539da3718d897f13ec5',
                 'info_dict': {
-                    'id': '8a250ba1450996e901453d7fa73c02f7',
+                    'id': '-t8CamQlQ2aYZ49ItZCFog-0',
                     'ext': 'flv',
-                    'title': 'Rick and Morty Close Rick-Counters of the Rick Kind part 3',
-                    'description': 'Rick has a run in with some old associates, resulting in a fallout with Morty. You got any chips, broh?',
-                    'uploader': 'Rick and Morty',
-                    'thumbnail': 'http://i.cdn.turner.com/asfix/repository/8a250ba13f865824013fc9db8b6b0400/thumbnail_267549017116827057.jpg'
-                }
-            },
-            {
-                'md5': 'e8818891d60e47b29cd89d7b0278156d',
-                'info_dict': {
-                    'id': '8a250ba1450996e901453d7fc8ba02f8',
-                    'ext': 'flv',
-                    'title': 'Rick and Morty Close Rick-Counters of the Rick Kind part 4',
-                    'description': 'Rick has a run in with some old associates, resulting in a fallout with Morty. You got any chips, broh?',
-                    'uploader': 'Rick and Morty',
-                    'thumbnail': 'http://i.cdn.turner.com/asfix/repository/8a250ba13f865824013fc9db8b6b0400/thumbnail_267549017116827057.jpg'
-                }
+                    'title': 'American Dad - Putting Francine Out of Business',
+                    'description': 'Stan hatches a plan to get Francine out of the real estate business.Watch more American Dad on [adult swim].'
+                },
             }
-        ]
-    }
+        ],
+        'info_dict': {
+            'id': '-t8CamQlQ2aYZ49ItZCFog',
+            'title': 'American Dad - Putting Francine Out of Business',
+            'description': 'Stan hatches a plan to get Francine out of the real estate business.Watch more American Dad on [adult swim].'
+        },
+    }]
 
-    _video_extensions = {
-        '3500': 'flv',
-        '640': 'mp4',
-        '150': 'mp4',
-        'ipad': 'm3u8',
-        'iphone': 'm3u8'
-    }
-    _video_dimensions = {
-        '3500': (1280, 720),
-        '640': (480, 270),
-        '150': (320, 180)
-    }
+    @staticmethod
+    def find_video_info(collection, slug):
+        for video in collection.get('videos'):
+            if video.get('slug') == slug:
+                return video
+
+    @staticmethod
+    def find_collection_by_linkURL(collections, linkURL):
+        for collection in collections:
+            if collection.get('linkURL') == linkURL:
+                return collection
+
+    @staticmethod
+    def find_collection_containing_video(collections, slug):
+        for collection in collections:
+            for video in collection.get('videos'):
+                if video.get('slug') == slug:
+                    return collection, video
 
     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url)
-        video_path = mobj.group('path')
+        show_path = mobj.group('show_path')
+        episode_path = mobj.group('episode_path')
+        is_playlist = True if mobj.group('is_playlist') else False
 
-        webpage = self._download_webpage(url, video_path)
-        episode_id = self._html_search_regex(
-            r'<link rel="video_src" href="http://i\.adultswim\.com/adultswim/adultswimtv/tools/swf/viralplayer.swf\?id=([0-9a-f]+?)"\s*/?\s*>',
-            webpage, 'episode_id')
-        title = self._og_search_title(webpage)
+        webpage = self._download_webpage(url, episode_path)
 
-        index_url = 'http://asfix.adultswim.com/asfix-svc/episodeSearch/getEpisodesByIDs?networkName=AS&ids=%s' % episode_id
-        idoc = self._download_xml(index_url, title, 'Downloading episode index', 'Unable to download episode index')
+        # Extract the value of `bootstrappedData` from the Javascript in the page.
+        bootstrappedDataJS = self._search_regex(r'var bootstrappedData = ({.*});', webpage, episode_path)
 
-        episode_el = idoc.find('.//episode')
-        show_title = episode_el.attrib.get('collectionTitle')
-        episode_title = episode_el.attrib.get('title')
-        thumbnail = episode_el.attrib.get('thumbnailUrl')
-        description = episode_el.find('./description').text.strip()
+        try:
+            bootstrappedData = json.loads(bootstrappedDataJS)
+        except ValueError as ve:
+            errmsg = '%s: Failed to parse JSON ' % episode_path
+            raise ExtractorError(errmsg, cause=ve)
+
+        # Downloading videos from a /videos/playlist/ URL needs to be handled differently.
+        # NOTE: We are only downloading one video (the current one) not the playlist
+        if is_playlist:
+            collections = bootstrappedData['playlists']['collections']
+            collection = self.find_collection_by_linkURL(collections, show_path)
+            video_info = self.find_video_info(collection, episode_path)
+
+            show_title = video_info['showTitle']
+            segment_ids = [video_info['videoPlaybackID']]
+        else:
+            collections = bootstrappedData['show']['collections']
+            collection, video_info = self.find_collection_containing_video(collections, episode_path)
+
+            show = bootstrappedData['show']
+            show_title = show['title']
+            segment_ids = [clip['videoPlaybackID'] for clip in video_info['clips']]
+
+        episode_id = video_info['id']
+        episode_title = video_info['title']
+        episode_description = video_info['description']
+        episode_duration = video_info.get('duration')
 
         entries = []
-        segment_els = episode_el.findall('./segments/segment')
+        for part_num, segment_id in enumerate(segment_ids):
+            segment_url = 'http://www.adultswim.com/videos/api/v0/assets?id=%s&platform=mobile' % segment_id
 
-        for part_num, segment_el in enumerate(segment_els):
-            segment_id = segment_el.attrib.get('id')
-            segment_title = '%s %s part %d' % (show_title, episode_title, part_num + 1)
-            thumbnail = segment_el.attrib.get('thumbnailUrl')
-            duration = segment_el.attrib.get('duration')
+            segment_title = '%s - %s' % (show_title, episode_title)
+            if len(segment_ids) > 1:
+                segment_title += ' Part %d' % (part_num + 1)
 
-            segment_url = 'http://asfix.adultswim.com/asfix-svc/episodeservices/getCvpPlaylist?networkName=AS&id=%s' % segment_id
             idoc = self._download_xml(
                 segment_url, segment_title,
                 'Downloading segment information', 'Unable to download segment information')
 
+            segment_duration = float_or_none(
+                xpath_text(idoc, './/trt', 'segment duration').strip())
+
             formats = []
             file_els = idoc.findall('.//files/file')
 
             for file_el in file_els:
                 bitrate = file_el.attrib.get('bitrate')
-                type = file_el.attrib.get('type')
-                width, height = self._video_dimensions.get(bitrate, (None, None))
+                ftype = file_el.attrib.get('type')
+
                 formats.append({
-                    'format_id': '%s-%s' % (bitrate, type),
-                    'url': file_el.text,
-                    'ext': self._video_extensions.get(bitrate, 'mp4'),
+                    'format_id': '%s_%s' % (bitrate, ftype),
+                    'url': file_el.text.strip(),
                     # The bitrate may not be a number (for example: 'iphone')
                     'tbr': int(bitrate) if bitrate.isdigit() else None,
-                    'height': height,
-                    'width': width
+                    'quality': 1 if ftype == 'hd' else -1
                 })
 
             self._sort_formats(formats)
@ -127,18 +156,16 @@ class AdultSwimIE(InfoExtractor):
                 'id': segment_id,
                 'title': segment_title,
                 'formats': formats,
-                'uploader': show_title,
-                'thumbnail': thumbnail,
-                'duration': duration,
-                'description': description
+                'duration': segment_duration,
+                'description': episode_description
             })
 
         return {
             '_type': 'playlist',
             'id': episode_id,
-            'display_id': video_path,
+            'display_id': episode_path,
             'entries': entries,
-            'title': '%s %s' % (show_title, episode_title),
-            'description': description,
-            'thumbnail': thumbnail
+            'title': '%s - %s' % (show_title, episode_title),
+            'description': episode_description,
+            'duration': episode_duration
         }
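The rewritten AdultSwimIE boils down to: parse a bootstrappedData JSON blob out of the page, then walk its collections to find the video whose slug matches the URL. That lookup is easy to exercise on toy data (the dictionary below is invented for illustration; real bootstrappedData carries many more fields):

collections = [
    {'linkURL': 'rick-and-morty', 'videos': [
        {'slug': 'pilot', 'title': 'Pilot',
         'clips': [{'videoPlaybackID': 'seg-1'}, {'videoPlaybackID': 'seg-2'}]},
    ]},
]

def find_collection_containing_video(collections, slug):
    # Mirrors the static helper in AdultSwimIE: first matching
    # (collection, video) pair wins.
    for collection in collections:
        for video in collection.get('videos'):
            if video.get('slug') == slug:
                return collection, video

collection, video_info = find_collection_containing_video(collections, 'pilot')
segment_ids = [clip['videoPlaybackID'] for clip in video_info['clips']]
print(segment_ids)  # ['seg-1', 'seg-2']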
103 youtube_dl/extractor/aftenposten.py Normal file
@ -0,0 +1,103 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .common import InfoExtractor
from ..utils import (
    int_or_none,
    parse_iso8601,
    xpath_with_ns,
    xpath_text,
    find_xpath_attr,
)


class AftenpostenIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?aftenposten\.no/webtv/([^/]+/)*(?P<id>[^/]+)-\d+\.html'

    _TEST = {
        'url': 'http://www.aftenposten.no/webtv/serier-og-programmer/sweatshopenglish/TRAILER-SWEATSHOP---I-cant-take-any-more-7800835.html?paging=&section=webtv_serierogprogrammer_sweatshop_sweatshopenglish',
        'md5': 'fd828cd29774a729bf4d4425fe192972',
        'info_dict': {
            'id': '21039',
            'ext': 'mov',
            'title': 'TRAILER: "Sweatshop" - I can´t take any more',
            'description': 'md5:21891f2b0dd7ec2f78d84a50e54f8238',
            'timestamp': 1416927969,
            'upload_date': '20141125',
        }
    }

    def _real_extract(self, url):
        display_id = self._match_id(url)

        webpage = self._download_webpage(url, display_id)

        video_id = self._html_search_regex(
            r'data-xs-id="(\d+)"', webpage, 'video id')

        data = self._download_xml(
            'http://frontend.xstream.dk/ap/feed/video/?platform=web&id=%s' % video_id, video_id)

        NS_MAP = {
            'atom': 'http://www.w3.org/2005/Atom',
            'xt': 'http://xstream.dk/',
            'media': 'http://search.yahoo.com/mrss/',
        }

        entry = data.find(xpath_with_ns('./atom:entry', NS_MAP))

        title = xpath_text(
            entry, xpath_with_ns('./atom:title', NS_MAP), 'title')
        description = xpath_text(
            entry, xpath_with_ns('./atom:summary', NS_MAP), 'description')
        timestamp = parse_iso8601(xpath_text(
            entry, xpath_with_ns('./atom:published', NS_MAP), 'upload date'))

        formats = []
        media_group = entry.find(xpath_with_ns('./media:group', NS_MAP))
        for media_content in media_group.findall(xpath_with_ns('./media:content', NS_MAP)):
            media_url = media_content.get('url')
            if not media_url:
                continue
            tbr = int_or_none(media_content.get('bitrate'))
            mobj = re.search(r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<playpath>.+)$', media_url)
            if mobj:
                formats.append({
                    'url': mobj.group('url'),
                    'play_path': 'mp4:%s' % mobj.group('playpath'),
                    'app': mobj.group('app'),
                    'ext': 'flv',
                    'tbr': tbr,
                    'format_id': 'rtmp-%d' % tbr,
                })
            else:
                formats.append({
                    'url': media_url,
                    'tbr': tbr,
                })
        self._sort_formats(formats)

        link = find_xpath_attr(
            entry, xpath_with_ns('./atom:link', NS_MAP), 'rel', 'original')
        if link is not None:
            formats.append({
                'url': link.get('href'),
                'format_id': link.get('rel'),
            })

        thumbnails = [{
            'url': splash.get('url'),
            'width': int_or_none(splash.get('width')),
            'height': int_or_none(splash.get('height')),
        } for splash in media_group.findall(xpath_with_ns('./xt:splash', NS_MAP))]

        return {
            'id': video_id,
            'title': title,
            'description': description,
            'timestamp': timestamp,
            'formats': formats,
            'thumbnails': thumbnails,
        }
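The Atom/MRSS parsing above leans on youtube-dl's xpath_with_ns helper, which rewrites a "prefix:tag" path into ElementTree's "{namespace-uri}tag" form so that NS_MAP stays readable. A rough standalone equivalent (a sketch of the idea, not necessarily the exact utils implementation):

import xml.etree.ElementTree as ET

NS_MAP = {'atom': 'http://www.w3.org/2005/Atom'}

def xpath_with_ns(path, ns_map):
    # './atom:title' -> './{http://www.w3.org/2005/Atom}title'
    components = [c.split(':') for c in path.split('/')]
    replaced = []
    for c in components:
        if len(c) == 1:
            replaced.append(c[0])
        else:
            ns, tag = c
            replaced.append('{%s}%s' % (ns_map[ns], tag))
    return '/'.join(replaced)

entry = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom"><title>Demo</title></entry>')
print(entry.find(xpath_with_ns('./atom:title', NS_MAP)).text)  # Demo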
@ -1,8 +1,6 @@
 # encoding: utf-8
 from __future__ import unicode_literals
 
-import re
-
 from .common import InfoExtractor
 
 
@ -21,9 +19,7 @@ class AftonbladetIE(InfoExtractor):
     }
 
     def _real_extract(self, url):
-        mobj = re.search(self._VALID_URL, url)
-
-        video_id = mobj.group('video_id')
+        video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)
 
         # find internal video meta data

35 youtube_dl/extractor/aljazeera.py Normal file
@ -0,0 +1,35 @@
from __future__ import unicode_literals

from .common import InfoExtractor


class AlJazeeraIE(InfoExtractor):
    _VALID_URL = r'http://www\.aljazeera\.com/programmes/.*?/(?P<id>[^/]+)\.html'

    _TEST = {
        'url': 'http://www.aljazeera.com/programmes/the-slum/2014/08/deliverance-201482883754237240.html',
        'info_dict': {
            'id': '3792260579001',
            'ext': 'mp4',
            'title': 'The Slum - Episode 1: Deliverance',
            'description': 'As a birth attendant advocating for family planning, Remy is on the frontline of Tondo\'s battle with overcrowding.',
            'uploader': 'Al Jazeera English',
        },
        'add_ie': ['Brightcove'],
    }

    def _real_extract(self, url):
        program_name = self._match_id(url)
        webpage = self._download_webpage(url, program_name)
        brightcove_id = self._search_regex(
            r'RenderPagesVideo\(\'(.+?)\'', webpage, 'brightcove id')

        return {
            '_type': 'url',
            'url': (
                'brightcove:'
                'playerKey=AQ~~%2CAAAAmtVJIFk~%2CTVGOQ5ZTwJbeMWnq5d_H4MOM57xfzApc'
                '&%40videoPlayer={0}'.format(brightcove_id)
            ),
            'ie_key': 'Brightcove',
        }
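AlJazeeraIE never downloads media itself: it returns a '_type': 'url' result whose ie_key hands the synthetic brightcove: URL off to the Brightcove extractor. The URL is plain string assembly (adjacent literals concatenate before .format is applied); a quick check with the id from the test above:

brightcove_id = '3792260579001'
url = (
    'brightcove:'
    'playerKey=AQ~~%2CAAAAmtVJIFk~%2CTVGOQ5ZTwJbeMWnq5d_H4MOM57xfzApc'
    '&%40videoPlayer={0}'.format(brightcove_id)
)
print(url)
# brightcove:playerKey=AQ~~%2CAAAAmtVJIFk~%2CTVGOQ5ZTwJbeMWnq5d_H4MOM57xfzApc&%40videoPlayer=3792260579001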
@ -5,15 +5,14 @@ import re
 import json
 
 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
-    compat_str,
     qualities,
-    determine_ext,
 )
 
 
 class AllocineIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?allocine\.fr/(?P<typ>article|video|film)/(fichearticle_gen_carticle=|player_gen_cmedia=|fichefilm_gen_cfilm=)(?P<id>[0-9]+)(?:\.html)?'
+    _VALID_URL = r'https?://(?:www\.)?allocine\.fr/(?P<typ>article|video|film)/(fichearticle_gen_carticle=|player_gen_cmedia=|fichefilm_gen_cfilm=|video-)(?P<id>[0-9]+)(?:\.html)?'
 
     _TESTS = [{
         'url': 'http://www.allocine.fr/article/fichearticle_gen_carticle=18635087.html',
@ -45,6 +44,9 @@ class AllocineIE(InfoExtractor):
             'description': 'md5:71742e3a74b0d692c7fce0dd2017a4ac',
             'thumbnail': 're:http://.*\.jpg',
         },
+    }, {
+        'url': 'http://www.allocine.fr/video/video-19550147/',
+        'only_matching': True,
     }]
 
     def _real_extract(self, url):
@ -75,9 +77,7 @@ class AllocineIE(InfoExtractor):
                 'format_id': format_id,
                 'quality': quality(format_id),
                 'url': v,
-                'ext': determine_ext(v),
             })
 
         self._sort_formats(formats)
 
         return {

77 youtube_dl/extractor/alphaporno.py Normal file
@ -0,0 +1,77 @@
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import (
    parse_iso8601,
    parse_duration,
    parse_filesize,
    int_or_none,
)


class AlphaPornoIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?alphaporno\.com/videos/(?P<id>[^/]+)'
    _TEST = {
        'url': 'http://www.alphaporno.com/videos/sensual-striptease-porn-with-samantha-alexandra/',
        'md5': 'feb6d3bba8848cd54467a87ad34bd38e',
        'info_dict': {
            'id': '258807',
            'display_id': 'sensual-striptease-porn-with-samantha-alexandra',
            'ext': 'mp4',
            'title': 'Sensual striptease porn with Samantha Alexandra',
            'thumbnail': 're:https?://.*\.jpg$',
            'timestamp': 1418694611,
            'upload_date': '20141216',
            'duration': 387,
            'filesize_approx': 54120000,
            'tbr': 1145,
            'categories': list,
            'age_limit': 18,
        }
    }

    def _real_extract(self, url):
        display_id = self._match_id(url)

        webpage = self._download_webpage(url, display_id)

        video_id = self._search_regex(
            r"video_id\s*:\s*'([^']+)'", webpage, 'video id', default=None)

        video_url = self._search_regex(
            r"video_url\s*:\s*'([^']+)'", webpage, 'video url')
        ext = self._html_search_meta(
            'encodingFormat', webpage, 'ext', default='.mp4')[1:]

        title = self._search_regex(
            [r'<meta content="([^"]+)" itemprop="description">',
             r'class="title" itemprop="name">([^<]+)<'],
            webpage, 'title')
        thumbnail = self._html_search_meta('thumbnail', webpage, 'thumbnail')
        timestamp = parse_iso8601(self._html_search_meta(
            'uploadDate', webpage, 'upload date'))
        duration = parse_duration(self._html_search_meta(
            'duration', webpage, 'duration'))
        filesize_approx = parse_filesize(self._html_search_meta(
            'contentSize', webpage, 'file size'))
        bitrate = int_or_none(self._html_search_meta(
            'bitrate', webpage, 'bitrate'))
        categories = self._html_search_meta(
            'keywords', webpage, 'categories', default='').split(',')

        age_limit = self._rta_search(webpage)

        return {
            'id': video_id,
            'display_id': display_id,
            'url': video_url,
            'ext': ext,
            'title': title,
            'thumbnail': thumbnail,
            'timestamp': timestamp,
            'duration': duration,
            'filesize_approx': filesize_approx,
            'tbr': bitrate,
            'categories': categories,
            'age_limit': age_limit,
        }
@ -3,7 +3,6 @@ from __future__ import unicode_literals
 import re
 
 from .common import InfoExtractor
-from .fivemin import FiveMinIE
 
 
 class AolIE(InfoExtractor):
@ -42,31 +41,30 @@ class AolIE(InfoExtractor):
     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url)
         video_id = mobj.group('id')
 
         playlist_id = mobj.group('playlist_id')
-        if playlist_id and not self._downloader.params.get('noplaylist'):
-            self.to_screen('Downloading playlist %s - add --no-playlist to just download video %s' % (playlist_id, video_id))
+        if not playlist_id or self._downloader.params.get('noplaylist'):
+            return self.url_result('5min:%s' % video_id)
 
-            webpage = self._download_webpage(url, playlist_id)
-            title = self._html_search_regex(
-                r'<h1 class="video-title[^"]*">(.+?)</h1>', webpage, 'title')
-            playlist_html = self._search_regex(
-                r"(?s)<ul\s+class='video-related[^']*'>(.*?)</ul>", webpage,
-                'playlist HTML')
-            entries = [{
-                '_type': 'url',
-                'url': 'aol-video:%s' % m.group('id'),
-                'ie_key': 'Aol',
-            } for m in re.finditer(
-                r"<a\s+href='.*videoid=(?P<id>[0-9]+)'\s+class='video-thumb'>",
-                playlist_html)]
+        self.to_screen('Downloading playlist %s - add --no-playlist to just download video %s' % (playlist_id, video_id))
 
-            return {
-                '_type': 'playlist',
-                'id': playlist_id,
-                'display_id': mobj.group('playlist_display_id'),
-                'title': title,
-                'entries': entries,
-            }
+        webpage = self._download_webpage(url, playlist_id)
+        title = self._html_search_regex(
+            r'<h1 class="video-title[^"]*">(.+?)</h1>', webpage, 'title')
+        playlist_html = self._search_regex(
+            r"(?s)<ul\s+class='video-related[^']*'>(.*?)</ul>", webpage,
+            'playlist HTML')
+        entries = [{
+            '_type': 'url',
+            'url': 'aol-video:%s' % m.group('id'),
+            'ie_key': 'Aol',
+        } for m in re.finditer(
+            r"<a\s+href='.*videoid=(?P<id>[0-9]+)'\s+class='video-thumb'>",
+            playlist_html)]
 
-        return FiveMinIE._build_result(video_id)
+        return {
+            '_type': 'playlist',
+            'id': playlist_id,
+            'display_id': mobj.group('playlist_display_id'),
+            'title': title,
+            'entries': entries,
+        }
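The AolIE change is mostly a control-flow inversion: instead of nesting the playlist branch under the old condition, the rewritten method bails out early with a url_result pointing at the 5min extractor, and everything else runs one indent level shallower. The same guard-clause shape, reduced to plain Python (the function and its return values are invented for illustration):

def extract(playlist_id, video_id, noplaylist=False):
    if not playlist_id or noplaylist:
        # Single video: one early exit for the simple case.
        return '5min:%s' % video_id
    # The playlist handling below no longer needs an extra indent level.
    return 'playlist:%s' % playlist_id

print(extract(None, '518167793'))         # 5min:518167793
print(extract('384', '518167793'))        # playlist:384
print(extract('384', '518167793', True))  # 5min:518167793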
@ -20,6 +20,7 @@ class AparatIE(InfoExtractor):
             'id': 'wP8On',
             'ext': 'mp4',
             'title': 'تیم گلکسی 11 - زومیت',
+            'age_limit': 0,
         },
         # 'skip': 'Extremely unreliable',
     }
@ -34,7 +35,8 @@ class AparatIE(InfoExtractor):
             video_id + '/vt/frame')
         webpage = self._download_webpage(embed_url, video_id)
 
-        video_urls = re.findall(r'fileList\[[0-9]+\]\s*=\s*"([^"]+)"', webpage)
+        video_urls = [video_url.replace('\\/', '/') for video_url in re.findall(
+            r'(?:fileList\[[0-9]+\]\s*=|"file"\s*:)\s*"([^"]+)"', webpage)]
         for i, video_url in enumerate(video_urls):
             req = HEADRequest(video_url)
             res = self._request_webpage(
@ -46,7 +48,7 @@ class AparatIE(InfoExtractor):
 
         title = self._search_regex(r'\s+title:\s*"([^"]+)"', webpage, 'title')
         thumbnail = self._search_regex(
-            r'\s+image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)
+            r'image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)
 
         return {
             'id': video_id,
@ -54,4 +56,5 @@ class AparatIE(InfoExtractor):
             'url': video_url,
             'ext': 'mp4',
             'thumbnail': thumbnail,
+            'age_limit': self._family_friendly_search(webpage),
         }
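The aparat change widens the URL regex and unescapes backslash-escaped slashes, since the player config can now also embed URLs as JSON string values of the form "file": "http:\/\/...". A quick check of that unescaping against a fabricated page snippet:

import re

webpage = r'''
fileList[0] = "http:\/\/host\/video-360.mp4";
"file": "http:\/\/host\/video-720.mp4",
'''

video_urls = [video_url.replace('\\/', '/') for video_url in re.findall(
    r'(?:fileList\[[0-9]+\]\s*=|"file"\s*:)\s*"([^"]+)"', webpage)]
print(video_urls)
# ['http://host/video-360.mp4', 'http://host/video-720.mp4']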
@ -4,8 +4,8 @@ import re
 import json
 
 from .common import InfoExtractor
+from ..compat import compat_urlparse
 from ..utils import (
-    compat_urlparse,
     int_or_none,
 )
 
@ -14,6 +14,9 @@ class AppleTrailersIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?trailers\.apple\.com/trailers/(?P<company>[^/]+)/(?P<movie>[^/]+)'
     _TEST = {
         "url": "http://trailers.apple.com/trailers/wb/manofsteel/",
+        'info_dict': {
+            'id': 'manofsteel',
+        },
         "playlist": [
             {
                 "md5": "d97a8e575432dbcb81b7c3acb741f8a8",
@ -122,14 +125,15 @@ class AppleTrailersIE(InfoExtractor):
             playlist.append({
                 '_type': 'video',
                 'id': video_id,
-                'title': title,
                 'formats': formats,
                 'title': title,
                 'duration': duration,
                 'thumbnail': thumbnail,
                 'upload_date': upload_date,
                 'uploader_id': uploader_id,
-                'user_agent': 'QuickTime compatible (youtube-dl)',
+                'http_headers': {
+                    'User-Agent': 'QuickTime compatible (youtube-dl)',
+                },
             })
 
         return {
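The last hunk swaps the ad-hoc 'user_agent' entry field for the generic 'http_headers' mapping, which lets an extractor attach arbitrary HTTP headers to an entry. A toy sketch of why the dict form composes better for a consumer (the surrounding names here are invented for illustration, not youtube-dl's actual downloader code):

entry = {
    'id': 'manofsteel-trailer',
    'http_headers': {
        'User-Agent': 'QuickTime compatible (youtube-dl)',
    },
}

# A downloader can fold these into its defaults without special-casing
# every possible header-like field name:
default_headers = {'Accept': '*/*'}
request_headers = dict(default_headers, **entry.get('http_headers', {}))
print(request_headers)
# {'Accept': '*/*', 'User-Agent': 'QuickTime compatible (youtube-dl)'}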
@ -1,42 +1,48 @@
 from __future__ import unicode_literals
 
-import json
-import re
-
 from .common import InfoExtractor
-from ..utils import (
-    unified_strdate,
-)
+from ..utils import unified_strdate
 
 
 class ArchiveOrgIE(InfoExtractor):
     IE_NAME = 'archive.org'
     IE_DESC = 'archive.org videos'
-    _VALID_URL = r'(?:https?://)?(?:www\.)?archive\.org/details/(?P<id>[^?/]+)(?:[?].*)?$'
-    _TEST = {
-        "url": "http://archive.org/details/XD300-23_68HighlightsAResearchCntAugHumanIntellect",
-        'file': 'XD300-23_68HighlightsAResearchCntAugHumanIntellect.ogv',
+    _VALID_URL = r'https?://(?:www\.)?archive\.org/details/(?P<id>[^?/]+)(?:[?].*)?$'
+    _TESTS = [{
+        'url': 'http://archive.org/details/XD300-23_68HighlightsAResearchCntAugHumanIntellect',
         'md5': '8af1d4cf447933ed3c7f4871162602db',
         'info_dict': {
-            "title": "1968 Demo - FJCC Conference Presentation Reel #1",
-            "description": "Reel 1 of 3: Also known as the \"Mother of All Demos\", Doug Engelbart's presentation at the Fall Joint Computer Conference in San Francisco, December 9, 1968 titled \"A Research Center for Augmenting Human Intellect.\" For this presentation, Doug and his team astonished the audience by not only relating their research, but demonstrating it live. This was the debut of the mouse, interactive computing, hypermedia, computer supported software engineering, video teleconferencing, etc. See also <a href=\"http://dougengelbart.org/firsts/dougs-1968-demo.html\" rel=\"nofollow\">Doug's 1968 Demo page</a> for more background, highlights, links, and the detailed paper published in this conference proceedings. Filmed on 3 reels: Reel 1 | <a href=\"http://www.archive.org/details/XD300-24_68HighlightsAResearchCntAugHumanIntellect\" rel=\"nofollow\">Reel 2</a> | <a href=\"http://www.archive.org/details/XD300-25_68HighlightsAResearchCntAugHumanIntellect\" rel=\"nofollow\">Reel 3</a>",
-            "upload_date": "19681210",
-            "uploader": "SRI International"
+            'id': 'XD300-23_68HighlightsAResearchCntAugHumanIntellect',
+            'ext': 'ogv',
+            'title': '1968 Demo - FJCC Conference Presentation Reel #1',
+            'description': 'md5:1780b464abaca9991d8968c877bb53ed',
+            'upload_date': '19681210',
+            'uploader': 'SRI International'
         }
-    }
+    }, {
+        'url': 'https://archive.org/details/Cops1922',
+        'md5': '18f2a19e6d89af8425671da1cf3d4e04',
+        'info_dict': {
+            'id': 'Cops1922',
+            'ext': 'ogv',
+            'title': 'Buster Keaton\'s "Cops" (1922)',
+            'description': 'md5:70f72ee70882f713d4578725461ffcc3',
+        }
+    }]
 
     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
 
         json_url = url + ('?' if '?' in url else '&') + 'output=json'
-        json_data = self._download_webpage(json_url, video_id)
-        data = json.loads(json_data)
+        data = self._download_json(json_url, video_id)
 
-        title = data['metadata']['title'][0]
-        description = data['metadata']['description'][0]
-        uploader = data['metadata']['creator'][0]
-        upload_date = unified_strdate(data['metadata']['date'][0])
+        def get_optional(data_dict, field):
+            return data_dict['metadata'].get(field, [None])[0]
+
+        title = get_optional(data, 'title')
+        description = get_optional(data, 'description')
+        uploader = get_optional(data, 'creator')
+        upload_date = unified_strdate(get_optional(data, 'date'))
 
         formats = [
             {
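The new get_optional helper exists because archive.org's metadata JSON wraps every field in a list and omits fields freely; .get with a one-element default turns a missing key into None instead of a KeyError. Standalone, with a made-up metadata dict:

data = {'metadata': {'title': ['1968 Demo'], 'date': ['1968-12-10']}}

def get_optional(data_dict, field):
    # .get(field, [None]) supplies a one-element default so [0] is always safe
    # when the key is absent.
    return data_dict['metadata'].get(field, [None])[0]

print(get_optional(data, 'title'))    # 1968 Demo
print(get_optional(data, 'creator'))  # None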
@ -23,13 +23,7 @@ class ARDMediathekIE(InfoExtractor):
 
     _TESTS = [{
         'url': 'http://mediathek.daserste.de/sendungen_a-z/328454_anne-will/22429276_vertrauen-ist-gut-spionieren-ist-besser-geht',
-        'file': '22429276.mp4',
-        'md5': '469751912f1de0816a9fc9df8336476c',
-        'info_dict': {
-            'title': 'Vertrauen ist gut, Spionieren ist besser - Geht so deutsch-amerikanische Freundschaft?',
-            'description': 'Das Erste Mediathek [ARD]: Vertrauen ist gut, Spionieren ist besser - Geht so deutsch-amerikanische Freundschaft?, Anne Will, Über die Spionage-Affäre diskutieren Clemens Binninger, Katrin Göring-Eckardt, Georg Mascolo, Andrew B. Denison und Constanze Kurz.. Das Video zur Sendung Anne Will am Mittwoch, 16.07.2014',
-        },
-        'skip': 'Blocked outside of Germany',
+        'only_matching': True,
     }, {
         'url': 'http://www.ardmediathek.de/tv/Tatort/Das-Wunder-von-Wolbeck-Video-tgl-ab-20/Das-Erste/Video?documentId=22490580&bcastId=602916',
         'info_dict': {
@ -37,7 +37,7 @@ class ArteTvIE(InfoExtractor):
             config_xml_url, video_id, note='Downloading configuration')
 
         formats = [{
-            'forma_id': q.attrib['quality'],
+            'format_id': q.attrib['quality'],
             # The playpath starts at 'mp4:', if we don't manually
             # split the url, rtmpdump will incorrectly parse them
             'url': q.text.split('mp4:', 1)[0],
@ -133,7 +133,7 @@ class ArteTVPlus7IE(InfoExtractor):
             'width': int_or_none(f.get('width')),
             'height': int_or_none(f.get('height')),
             'tbr': int_or_none(f.get('bitrate')),
-            'quality': qfunc(f['quality']),
+            'quality': qfunc(f.get('quality')),
             'source_preference': source_pref,
         }

163 youtube_dl/extractor/atresplayer.py Normal file
@ -0,0 +1,163 @@
from __future__ import unicode_literals

import time
import hmac

from .subtitles import SubtitlesInfoExtractor
from ..compat import (
    compat_str,
    compat_urllib_parse,
    compat_urllib_request,
)
from ..utils import (
    int_or_none,
    float_or_none,
    xpath_text,
    ExtractorError,
)


class AtresPlayerIE(SubtitlesInfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?atresplayer\.com/television/[^/]+/[^/]+/[^/]+/(?P<id>.+?)_\d+\.html'
    _TESTS = [
        {
            'url': 'http://www.atresplayer.com/television/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_2014122100174.html',
            'md5': 'efd56753cda1bb64df52a3074f62e38a',
            'info_dict': {
                'id': 'capitulo-10-especial-solidario-nochebuena',
                'ext': 'mp4',
                'title': 'Especial Solidario de Nochebuena',
                'description': 'md5:e2d52ff12214fa937107d21064075bf1',
                'duration': 5527.6,
                'thumbnail': 're:^https?://.*\.jpg$',
            },
        },
        {
            'url': 'http://www.atresplayer.com/television/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_2014122400174.html',
            'only_matching': True,
        },
    ]

    _USER_AGENT = 'Dalvik/1.6.0 (Linux; U; Android 4.3; GT-I9300 Build/JSS15J'
    _MAGIC = 'QWtMLXs414Yo+c#_+Q#K@NN)'
    _TIMESTAMP_SHIFT = 30000

    _TIME_API_URL = 'http://servicios.atresplayer.com/api/admin/time.json'
    _URL_VIDEO_TEMPLATE = 'https://servicios.atresplayer.com/api/urlVideo/{1}/{0}/{1}|{2}|{3}.json'
    _PLAYER_URL_TEMPLATE = 'https://servicios.atresplayer.com/episode/getplayer.json?episodePk=%s'
    _EPISODE_URL_TEMPLATE = 'http://www.atresplayer.com/episodexml/%s'

    _LOGIN_URL = 'https://servicios.atresplayer.com/j_spring_security_check'

    def _real_initialize(self):
        self._login()

    def _login(self):
        (username, password) = self._get_login_info()
        if username is None:
            return

        login_form = {
            'j_username': username,
            'j_password': password,
        }

        request = compat_urllib_request.Request(
            self._LOGIN_URL, compat_urllib_parse.urlencode(login_form).encode('utf-8'))
        request.add_header('Content-Type', 'application/x-www-form-urlencoded')
        response = self._download_webpage(
            request, None, 'Logging in as %s' % username)

        error = self._html_search_regex(
            r'(?s)<ul class="list_error">(.+?)</ul>', response, 'error', default=None)
        if error:
            raise ExtractorError(
                'Unable to login: %s' % error, expected=True)

    def _real_extract(self, url):
        video_id = self._match_id(url)

        webpage = self._download_webpage(url, video_id)

        episode_id = self._search_regex(
            r'episode="([^"]+)"', webpage, 'episode id')

        timestamp = int_or_none(self._download_webpage(
            self._TIME_API_URL,
            video_id, 'Downloading timestamp', fatal=False), 1000, time.time())
        timestamp_shifted = compat_str(timestamp + self._TIMESTAMP_SHIFT)
        token = hmac.new(
            self._MAGIC.encode('ascii'),
            (episode_id + timestamp_shifted).encode('utf-8')
        ).hexdigest()

        formats = []
        for fmt in ['windows', 'android_tablet']:
            request = compat_urllib_request.Request(
                self._URL_VIDEO_TEMPLATE.format(fmt, episode_id, timestamp_shifted, token))
            request.add_header('User-Agent', self._USER_AGENT)

            fmt_json = self._download_json(
                request, video_id, 'Downloading %s video JSON' % fmt)

            result = fmt_json.get('resultDes')
            if result.lower() != 'ok':
                raise ExtractorError(
                    '%s returned error: %s' % (self.IE_NAME, result), expected=True)

            for format_id, video_url in fmt_json['resultObject'].items():
                if format_id == 'token' or not video_url.startswith('http'):
                    continue
                if video_url.endswith('/Manifest'):
                    if 'geodeswowsmpra3player' in video_url:
                        f4m_path = video_url.split('smil:', 1)[-1].split('free_', 1)[0]
                        f4m_url = 'http://drg.antena3.com/{0}hds/es/sd.f4m'.format(f4m_path)
                        # this videos are protected by DRM, the f4m downloader doesn't support them
                        continue
                    else:
                        f4m_url = video_url[:-9] + '/manifest.f4m'
                    formats.extend(self._extract_f4m_formats(f4m_url, video_id))
                else:
                    formats.append({
                        'url': video_url,
                        'format_id': 'android-%s' % format_id,
                        'preference': 1,
                    })
        self._sort_formats(formats)

        player = self._download_json(
            self._PLAYER_URL_TEMPLATE % episode_id,
            episode_id)

        path_data = player.get('pathData')

        episode = self._download_xml(
            self._EPISODE_URL_TEMPLATE % path_data,
            video_id, 'Downloading episode XML')

        duration = float_or_none(xpath_text(
            episode, './media/asset/info/technical/contentDuration', 'duration'))

        art = episode.find('./media/asset/info/art')
        title = xpath_text(art, './name', 'title')
        description = xpath_text(art, './description', 'description')
        thumbnail = xpath_text(episode, './media/asset/files/background', 'thumbnail')

        subtitles = {}
        subtitle = xpath_text(episode, './media/asset/files/subtitle', 'subtitle')
        if subtitle:
            subtitles['es'] = subtitle

        if self._downloader.params.get('listsubtitles', False):
            self._list_available_subtitles(video_id, subtitles)
            return

        return {
            'id': video_id,
            'title': title,
            'description': description,
            'thumbnail': thumbnail,
            'duration': duration,
            'formats': formats,
            'subtitles': self.extract_subtitles(video_id, subtitles),
        }
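The token computation in AtresPlayerIE is a plain keyed hash over the episode id concatenated with a shifted server timestamp. Reproducing it standalone (episode id and timestamp below are fake; note that hmac.new without a digestmod defaults to MD5 on Python 2, which is what the extractor relies on, so spelling it out keeps the sketch working on Python 3):

import hashlib
import hmac

_MAGIC = 'QWtMLXs414Yo+c#_+Q#K@NN)'
_TIMESTAMP_SHIFT = 30000

episode_id = '1234567'       # hypothetical value
timestamp = 1420000000       # would come from time.json in the real extractor
timestamp_shifted = str(timestamp + _TIMESTAMP_SHIFT)

token = hmac.new(
    _MAGIC.encode('ascii'),
    (episode_id + timestamp_shifted).encode('utf-8'),
    hashlib.md5,
).hexdigest()
print(token)  # 32-char hex digest sent back in the urlVideo request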
55 youtube_dl/extractor/atttechchannel.py Normal file
@ -0,0 +1,55 @@
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import unified_strdate


class ATTTechChannelIE(InfoExtractor):
    _VALID_URL = r'https?://techchannel\.att\.com/play-video\.cfm/([^/]+/)*(?P<id>.+)'
    _TEST = {
        'url': 'http://techchannel.att.com/play-video.cfm/2014/1/27/ATT-Archives-The-UNIX-System-Making-Computers-Easier-to-Use',
        'info_dict': {
            'id': '11316',
            'display_id': 'ATT-Archives-The-UNIX-System-Making-Computers-Easier-to-Use',
            'ext': 'flv',
            'title': 'AT&T Archives : The UNIX System: Making Computers Easier to Use',
            'description': 'A 1982 film about UNIX is the foundation for software in use around Bell Labs and AT&T.',
            'thumbnail': 're:^https?://.*\.jpg$',
            'upload_date': '20140127',
        },
        'params': {
            # rtmp download
            'skip_download': True,
        },
    }

    def _real_extract(self, url):
        display_id = self._match_id(url)

        webpage = self._download_webpage(url, display_id)

        video_url = self._search_regex(
            r"url\s*:\s*'(rtmp://[^']+)'",
            webpage, 'video URL')

        video_id = self._search_regex(
            r'mediaid\s*=\s*(\d+)',
            webpage, 'video id', fatal=False)

        title = self._og_search_title(webpage)
        description = self._og_search_description(webpage)
        thumbnail = self._og_search_thumbnail(webpage)
        upload_date = unified_strdate(self._search_regex(
            r'[Rr]elease\s+date:\s*(\d{1,2}/\d{1,2}/\d{4})',
            webpage, 'upload date', fatal=False), False)

        return {
            'id': video_id,
            'display_id': display_id,
            'url': video_url,
            'ext': 'flv',
            'title': title,
            'description': description,
            'thumbnail': thumbnail,
            'upload_date': upload_date,
        }
@ -1,11 +1,15 @@
 # coding: utf-8
 from __future__ import unicode_literals
 
+import itertools
+import time
+
 from .common import InfoExtractor
 from .soundcloud import SoundcloudIE
-from ..utils import ExtractorError
-
-import time
+from ..utils import (
+    ExtractorError,
+    url_basename,
+)
 
 
 class AudiomackIE(InfoExtractor):
@ -17,53 +21,124 @@ class AudiomackIE(InfoExtractor):
         'url': 'http://www.audiomack.com/song/roosh-williams/extraordinary',
         'info_dict':
         {
-            'id': 'roosh-williams/extraordinary',
+            'id': '310086',
             'ext': 'mp3',
-            'title': 'Roosh Williams - Extraordinary'
+            'uploader': 'Roosh Williams',
+            'title': 'Extraordinary'
         }
     },
-    # hosted on soundcloud via audiomack
+    # audiomack wrapper around soundcloud song
     {
+        'add_ie': ['Soundcloud'],
         'url': 'http://www.audiomack.com/song/xclusiveszone/take-kare',
-        'file': '172419696.mp3',
-        'info_dict':
-        {
-            'ext': 'mp3',
-            'title': 'Young Thug ft Lil Wayne - Take Kare',
-            "upload_date": "20141016",
-            "description": "New track produced by London On Da Track called “Take Kare\"\n\nhttp://instagram.com/theyoungthugworld\nhttps://www.facebook.com/ThuggerThuggerCashMoney\n",
-            "uploader": "Young Thug World"
-        }
+        'info_dict': {
+            'id': '172419696',
+            'ext': 'mp3',
+            'description': 'md5:1fc3272ed7a635cce5be1568c2822997',
+            'title': 'Young Thug ft Lil Wayne - Take Kare',
+            'uploader': 'Young Thug World',
+            'upload_date': '20141016',
+        }
+    },
+    ]
+
+    def _real_extract(self, url):
+        # URLs end with [uploader name]/[uploader title]
+        # this title is whatever the user types in, and is rarely
+        # the proper song title. Real metadata is in the api response
+        album_url_tag = self._match_id(url)
+
+        # Request the extended version of the api for extra fields like artist and title
+        api_response = self._download_json(
+            'http://www.audiomack.com/api/music/url/song/%s?extended=1&_=%d' % (
+                album_url_tag, time.time()),
+            album_url_tag)
+
+        # API is inconsistent with errors
+        if 'url' not in api_response or not api_response['url'] or 'error' in api_response:
+            raise ExtractorError('Invalid url %s', url)
+
+        # Audiomack wraps a lot of soundcloud tracks in their branded wrapper
+        # if so, pass the work off to the soundcloud extractor
+        if SoundcloudIE.suitable(api_response['url']):
+            return {'_type': 'url', 'url': api_response['url'], 'ie_key': 'Soundcloud'}
+
+        return {
+            'id': api_response.get('id', album_url_tag),
+            'uploader': api_response.get('artist'),
+            'title': api_response.get('title'),
+            'url': api_response['url'],
+        }
+
+
+class AudiomackAlbumIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?audiomack\.com/album/(?P<id>[\w/-]+)'
+    IE_NAME = 'audiomack:album'
+    _TESTS = [
+        # Standard album playlist
+        {
+            'url': 'http://www.audiomack.com/album/flytunezcom/tha-tour-part-2-mixtape',
+            'playlist_count': 15,
+            'info_dict':
+            {
+                'id': '812251',
+                'title': 'Tha Tour: Part 2 (Official Mixtape)'
+            }
+        },
+        # Album playlist ripped from fakeshoredrive with no metadata
+        {
+            'url': 'http://www.audiomack.com/album/fakeshoredrive/ppp-pistol-p-project',
+            'info_dict': {
+                'title': 'PPP (Pistol P Project)',
+                'id': '837572',
+            },
+            'playlist': [{
+                'info_dict': {
+                    'title': 'PPP (Pistol P Project) - 9. Heaven or Hell (CHIMACA) ft Zuse (prod by DJ FU)',
+                    'id': '837577',
+                    'ext': 'mp3',
+                    'uploader': 'Lil Herb a.k.a. G Herbo',
+                }
+            }],
+            'params': {
+                'playliststart': 9,
+                'playlistend': 9,
+            }
         }
     ]
 
     def _real_extract(self, url):
-        video_id = self._match_id(url)
+        # URLs end with [uploader name]/[uploader title]
+        # this title is whatever the user types in, and is rarely
+        # the proper song title. Real metadata is in the api response
+        album_url_tag = self._match_id(url)
+        result = {'_type': 'playlist', 'entries': []}
+        # There is no one endpoint for album metadata - instead it is included/repeated in each song's metadata
+        # Therefore we don't know how many songs the album has and must infi-loop until failure
+        for track_no in itertools.count():
+            # Get song's metadata
+            api_response = self._download_json(
+                'http://www.audiomack.com/api/music/url/album/%s/%d?extended=1&_=%d'
+                % (album_url_tag, track_no, time.time()), album_url_tag,
+                note='Querying song information (%d)' % (track_no + 1))
 
-        api_response = self._download_json(
-            "http://www.audiomack.com/api/music/url/song/%s?_=%d" % (
-                video_id, time.time()),
-            video_id)
-
-        if "url" not in api_response:
-            raise ExtractorError("Unable to deduce api url of song")
-        realurl = api_response["url"]
-
-        # Audiomack wraps a lot of soundcloud tracks in their branded wrapper
-        # - if so, pass the work off to the soundcloud extractor
-        if SoundcloudIE.suitable(realurl):
-            return {'_type': 'url', 'url': realurl, 'ie_key': 'Soundcloud'}
-
-        webpage = self._download_webpage(url, video_id)
-        artist = self._html_search_regex(
-            r'<span class="artist">(.*?)</span>', webpage, "artist")
-        songtitle = self._html_search_regex(
-            r'<h1 class="profile-title song-title"><span class="artist">.*?</span>(.*?)</h1>',
-            webpage, "title")
-        title = artist + " - " + songtitle
-
-        return {
-            'id': video_id,
-            'title': title,
-            'url': realurl,
-        }
+            # Total failure, only occurs when url is totally wrong
+            # Won't happen in middle of valid playlist (next case)
+            if 'url' not in api_response or 'error' in api_response:
+                raise ExtractorError('Invalid url for track %d of album url %s' % (track_no, url))
+            # URL is good but song id doesn't exist - usually means end of playlist
+            elif not api_response['url']:
+                break
+            else:
+                # Pull out the album metadata and add to result (if it exists)
+                for resultkey, apikey in [('id', 'album_id'), ('title', 'album_title')]:
+                    if apikey in api_response and resultkey not in result:
+                        result[resultkey] = api_response[apikey]
+                song_id = url_basename(api_response['url']).rpartition('.')[0]
+                result['entries'].append({
+                    'id': api_response.get('id', song_id),
+                    'uploader': api_response.get('artist'),
+                    'title': api_response.get('title', song_id),
+                    'url': api_response['url'],
+                })
        return result
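AudiomackAlbumIE has no "number of tracks" endpoint, so it probes track 0, 1, 2, ... with itertools.count() and stops when the API returns a well-formed response with an empty url. The loop skeleton, with the network call stubbed out (fake_api and RESPONSES are invented stand-ins for _download_json):

import itertools

RESPONSES = [
    {'url': 'http://example.com/t0.mp3', 'title': 'Track 0'},
    {'url': 'http://example.com/t1.mp3', 'title': 'Track 1'},
    {'url': ''},  # empty url marks the end of the album
]

def fake_api(track_no):
    return RESPONSES[track_no]

entries = []
for track_no in itertools.count():
    api_response = fake_api(track_no)
    if 'url' not in api_response:
        # totally wrong URL; never happens mid-playlist
        raise ValueError('Invalid url for track %d' % track_no)
    elif not api_response['url']:
        break  # end of playlist
    else:
        entries.append(api_response['title'])
print(entries)  # ['Track 0', 'Track 1']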
@ -1,54 +0,0 @@
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..utils import (
-    compat_urllib_parse,
-    determine_ext,
-    ExtractorError,
-)
-
-
-class AUEngineIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?auengine\.com/embed\.php\?.*?file=(?P<id>[^&]+).*?'
-
-    _TEST = {
-        'url': 'http://auengine.com/embed.php?file=lfvlytY6&w=650&h=370',
-        'md5': '48972bdbcf1a3a2f5533e62425b41d4f',
-        'info_dict': {
-            'id': 'lfvlytY6',
-            'ext': 'mp4',
-            'title': '[Commie]The Legend of the Legendary Heroes - 03 - Replication Eye (Alpha Stigma)[F9410F5A]'
-        }
-    }
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, video_id)
-        title = self._html_search_regex(r'<title>(?P<title>.+?)</title>', webpage, 'title')
-        title = title.strip()
-        links = re.findall(r'\s(?:file|url):\s*["\']([^\'"]+)["\']', webpage)
-        links = map(compat_urllib_parse.unquote, links)
-
-        thumbnail = None
-        video_url = None
-        for link in links:
-            if link.endswith('.png'):
-                thumbnail = link
-            elif '/videos/' in link:
-                video_url = link
-        if not video_url:
-            raise ExtractorError('Could not find video URL')
-        ext = '.' + determine_ext(video_url)
-        if ext == title[-len(ext):]:
-            title = title[:-len(ext)]
-
-        return {
-            'id': video_id,
-            'url': video_url,
-            'title': title,
-            'thumbnail': thumbnail,
-            'http_referer': 'http://www.auengine.com/flowplayer/flowplayer.commercial-3.2.14.swf',
-        }

93 youtube_dl/extractor/azubu.py Normal file
@ -0,0 +1,93 @@
|
|||||||
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
|
import json
|
||||||
|
|
||||||
|
from .common import InfoExtractor
|
||||||
|
from ..utils import float_or_none
|
||||||
|
|
||||||
|
|
||||||
|
class AzubuIE(InfoExtractor):
|
||||||
|
_VALID_URL = r'https?://(?:www\.)?azubu\.tv/[^/]+#!/play/(?P<id>\d+)'
|
||||||
|
_TESTS = [
|
||||||
|
{
|
||||||
|
'url': 'http://www.azubu.tv/GSL#!/play/15575/2014-hot6-cup-last-big-match-ro8-day-1',
|
||||||
|
'md5': 'a88b42fcf844f29ad6035054bd9ecaf4',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '15575',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': '2014 HOT6 CUP LAST BIG MATCH Ro8 Day 1',
|
||||||
|
'description': 'md5:d06bdea27b8cc4388a90ad35b5c66c01',
|
||||||
|
'thumbnail': 're:^https?://.*\.jpe?g',
|
||||||
|
'timestamp': 1417523507.334,
|
||||||
|
'upload_date': '20141202',
|
||||||
|
'duration': 9988.7,
|
||||||
|
'uploader': 'GSL',
|
||||||
|
'uploader_id': 414310,
|
||||||
|
'view_count': int,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
'url': 'http://www.azubu.tv/FnaticTV#!/play/9344/-fnatic-at-worlds-2014:-toyz---%22i-love-rekkles,-he-has-amazing-mechanics%22-',
|
||||||
|
'md5': 'b72a871fe1d9f70bd7673769cdb3b925',
|
||||||
|
'info_dict': {
|
||||||
|
'id': '9344',
|
||||||
|
'ext': 'mp4',
|
||||||
|
'title': 'Fnatic at Worlds 2014: Toyz - "I love Rekkles, he has amazing mechanics"',
|
||||||
|
'description': 'md5:4a649737b5f6c8b5c5be543e88dc62af',
|
||||||
|
'thumbnail': 're:^https?://.*\.jpe?g',
|
||||||
|
'timestamp': 1410530893.320,
|
||||||
|
'upload_date': '20140912',
|
||||||
|
'duration': 172.385,
|
||||||
|
'uploader': 'FnaticTV',
|
||||||
|
'uploader_id': 272749,
|
||||||
|
'view_count': int,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
]
|
||||||
|
|
||||||
|
def _real_extract(self, url):
|
||||||
|
video_id = self._match_id(url)
|
||||||
|
|
||||||
|
data = self._download_json(
|
||||||
|
'http://www.azubu.tv/api/video/%s' % video_id, video_id)['data']
|
||||||
|
|
||||||
|
title = data['title'].strip()
|
||||||
|
description = data['description']
|
||||||
|
thumbnail = data['thumbnail']
|
||||||
|
view_count = data['view_count']
|
||||||
|
uploader = data['user']['username']
|
||||||
|
uploader_id = data['user']['id']
|
||||||
|
|
||||||
|
stream_params = json.loads(data['stream_params'])
|
||||||
|
|
||||||
|
timestamp = float_or_none(stream_params['creationDate'], 1000)
|
||||||
|
duration = float_or_none(stream_params['length'], 1000)
|
||||||
|
|
||||||
|
renditions = stream_params.get('renditions') or []
|
||||||
|
video = stream_params.get('FLVFullLength') or stream_params.get('videoFullLength')
|
||||||
|
if video:
|
||||||
|
renditions.append(video)
|
||||||
|
|
||||||
|
formats = [{
|
||||||
|
'url': fmt['url'],
|
||||||
|
'width': fmt['frameWidth'],
|
||||||
|
'height': fmt['frameHeight'],
|
||||||
|
'vbr': float_or_none(fmt['encodingRate'], 1000),
|
||||||
|
'filesize': fmt['size'],
|
||||||
|
'vcodec': fmt['videoCodec'],
|
||||||
|
'container': fmt['videoContainer'],
|
||||||
|
} for fmt in renditions if fmt['url']]
|
||||||
|
self._sort_formats(formats)
|
||||||
|
|
||||||
|
return {
|
||||||
|
'id': video_id,
|
||||||
|
'title': title,
|
||||||
|
'description': description,
|
||||||
|
'thumbnail': thumbnail,
|
||||||
|
'timestamp': timestamp,
|
||||||
|
'duration': duration,
|
||||||
|
'uploader': uploader,
|
||||||
|
'uploader_id': uploader_id,
|
||||||
|
'view_count': view_count,
|
||||||
|
'formats': formats,
|
||||||
|
}
|
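Reviewer note: a quick way to smoke-test a new extractor like AzubuIE is to drive it through the Python API rather than the CLI. A minimal sketch, assuming a youtube-dl checkout with this extractor registered (metadata only, URL taken from the test case above):

    import youtube_dl

    with youtube_dl.YoutubeDL({'skip_download': True}) as ydl:
        info = ydl.extract_info(
            'http://www.azubu.tv/GSL#!/play/15575/2014-hot6-cup-last-big-match-ro8-day-1',
            download=False)
        print(info['title'], info['duration'])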
@@ -5,7 +5,7 @@ import json
 import itertools
 
 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
     compat_urllib_request,
 )
 
@@ -50,7 +50,7 @@ class BambuserIE(InfoExtractor):
             'duration': int(info['length']),
             'view_count': int(info['views_total']),
             'uploader': info['username'],
-            'uploader_id': info['uid'],
+            'uploader_id': info['owner']['uid'],
         }
 
 
@@ -4,9 +4,11 @@ import json
 import re
 
 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
     compat_str,
     compat_urlparse,
+)
+from ..utils import (
     ExtractorError,
 )
 
@@ -70,26 +72,29 @@ class BandcampIE(InfoExtractor):
 
         download_link = m_download.group(1)
         video_id = self._search_regex(
-            r'var TralbumData = {.*?id: (?P<id>\d+),?$',
-            webpage, 'video id', flags=re.MULTILINE | re.DOTALL)
+            r'(?ms)var TralbumData = {.*?id: (?P<id>\d+),?$',
+            webpage, 'video id')
 
         download_webpage = self._download_webpage(download_link, video_id, 'Downloading free downloads page')
         # We get the dictionary of the track from some javascript code
-        info = re.search(r'items: (.*?),$', download_webpage, re.MULTILINE).group(1)
-        info = json.loads(info)[0]
+        all_info = self._parse_json(self._search_regex(
+            r'(?sm)items: (.*?),$', download_webpage, 'items'), video_id)
+        info = all_info[0]
         # We pick mp3-320 for now, until format selection can be easily implemented.
         mp3_info = info['downloads']['mp3-320']
         # If we try to use this url it says the link has expired
         initial_url = mp3_info['url']
-        re_url = r'(?P<server>http://(.*?)\.bandcamp\.com)/download/track\?enc=mp3-320&fsig=(?P<fsig>.*?)&id=(?P<id>.*?)&ts=(?P<ts>.*)$'
-        m_url = re.match(re_url, initial_url)
+        m_url = re.match(
+            r'(?P<server>http://(.*?)\.bandcamp\.com)/download/track\?enc=mp3-320&fsig=(?P<fsig>.*?)&id=(?P<id>.*?)&ts=(?P<ts>.*)$',
+            initial_url)
         # We build the url we will use to get the final track url
         # This url is build in Bandcamp in the script download_bunde_*.js
         request_url = '%s/statdownload/track?enc=mp3-320&fsig=%s&id=%s&ts=%s&.rand=665028774616&.vrs=1' % (m_url.group('server'), m_url.group('fsig'), video_id, m_url.group('ts'))
         final_url_webpage = self._download_webpage(request_url, video_id, 'Requesting download url')
         # If we could correctly generate the .rand field the url would be
         # in the "download_url" key
-        final_url = re.search(r'"retry_url":"(.*?)"', final_url_webpage).group(1)
+        final_url = self._search_regex(
+            r'"retry_url":"(.*?)"', final_url_webpage, 'final video URL')
 
         return {
             'id': video_id,
@@ -104,7 +109,7 @@ class BandcampIE(InfoExtractor):
 
 class BandcampAlbumIE(InfoExtractor):
     IE_NAME = 'Bandcamp:album'
-    _VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<title>[^?#]+))'
+    _VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<album_id>[^?#]+)|/?(?:$|[?#]))'
 
     _TESTS = [{
         'url': 'http://blazo.bandcamp.com/album/jazz-format-mixtape-vol-1',
@@ -128,36 +133,49 @@ class BandcampAlbumIE(InfoExtractor):
         ],
         'info_dict': {
             'title': 'Jazz Format Mixtape vol.1',
+            'id': 'jazz-format-mixtape-vol-1',
+            'uploader_id': 'blazo',
         },
         'params': {
             'playlistend': 2
         },
-        'skip': 'Bandcamp imposes download limits. See test_playlists:test_bandcamp_album for the playlist test'
+        'skip': 'Bandcamp imposes download limits.'
     }, {
         'url': 'http://nightbringer.bandcamp.com/album/hierophany-of-the-open-grave',
         'info_dict': {
             'title': 'Hierophany of the Open Grave',
+            'uploader_id': 'nightbringer',
+            'id': 'hierophany-of-the-open-grave',
         },
         'playlist_mincount': 9,
+    }, {
+        'url': 'http://dotscale.bandcamp.com',
+        'info_dict': {
+            'title': 'Loom',
+            'id': 'dotscale',
+            'uploader_id': 'dotscale',
+        },
+        'playlist_mincount': 7,
     }]
 
     def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
-        playlist_id = mobj.group('subdomain')
-        title = mobj.group('title')
-        display_id = title or playlist_id
-        webpage = self._download_webpage(url, display_id)
+        uploader_id = mobj.group('subdomain')
+        album_id = mobj.group('album_id')
+        playlist_id = album_id or uploader_id
+        webpage = self._download_webpage(url, playlist_id)
        tracks_paths = re.findall(r'<a href="(.*?)" itemprop="url">', webpage)
        if not tracks_paths:
            raise ExtractorError('The page doesn\'t contain any tracks')
        entries = [
            self.url_result(compat_urlparse.urljoin(url, t_path), ie=BandcampIE.ie_key())
            for t_path in tracks_paths]
-        title = self._search_regex(r'album_title : "(.*?)"', webpage, 'title')
+        title = self._search_regex(
+            r'album_title\s*:\s*"(.*?)"', webpage, 'title', fatal=False)
        return {
            '_type': 'playlist',
+            'uploader_id': uploader_id,
            'id': playlist_id,
-            'display_id': display_id,
            'title': title,
            'entries': entries,
        }
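Reviewer note: the switch from bare re.search(...).group(1) / json.loads to the _search_regex / _parse_json helpers is about failure modes. When Bandcamp changes its page layout, the raw calls die with an opaque AttributeError on None, while the helpers raise an ExtractorError naming the missing field. A standalone illustration of the hazard (the webpage value is a hypothetical page with the data missing):

    import re

    webpage = '<script>var TralbumData = {};</script>'
    m = re.search(r'items: (.*?),$', webpage, re.MULTILINE)
    print(m)  # None -- calling m.group(1) here raises AttributeError with no context
    # self._search_regex(r'(?sm)items: (.*?),$', webpage, 'items') would instead
    # fail with "Unable to extract items", pointing straight at the broken field.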
@@ -1,15 +1,16 @@
 from __future__ import unicode_literals
 
-import re
+import xml.etree.ElementTree
 
 from .subtitles import SubtitlesInfoExtractor
 from ..utils import ExtractorError
+from ..compat import compat_HTTPError
 
 
 class BBCCoUkIE(SubtitlesInfoExtractor):
     IE_NAME = 'bbc.co.uk'
     IE_DESC = 'BBC iPlayer'
-    _VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:programmes|iplayer/episode)/(?P<id>[\da-z]{8})'
+    _VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:(?:(?:programmes|iplayer(?:/[^/]+)?/(?:episode|playlist))/)|music/clips[/#])(?P<id>[\da-z]{8})'
 
     _TESTS = [
         {
@@ -17,8 +18,8 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
             'info_dict': {
                 'id': 'b039d07m',
                 'ext': 'flv',
-                'title': 'Kaleidoscope: Leonard Cohen',
-                'description': 'md5:db4755d7a665ae72343779f7dacb402c',
+                'title': 'Kaleidoscope, Leonard Cohen',
+                'description': 'The Canadian poet and songwriter reflects on his musical career.',
                 'duration': 1740,
             },
             'params': {
@@ -55,6 +56,71 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
                 'skip_download': True,
             },
             'skip': 'Currently BBC iPlayer TV programmes are available to play in the UK only',
+        },
+        {
+            'url': 'http://www.bbc.co.uk/iplayer/episode/p026c7jt/tomorrows-worlds-the-unearthly-history-of-science-fiction-2-invasion',
+            'info_dict': {
+                'id': 'b03k3pb7',
+                'ext': 'flv',
+                'title': "Tomorrow's Worlds: The Unearthly History of Science Fiction",
+                'description': '2. Invasion',
+                'duration': 3600,
+            },
+            'params': {
+                # rtmp download
+                'skip_download': True,
+            },
+            'skip': 'Currently BBC iPlayer TV programmes are available to play in the UK only',
+        }, {
+            'url': 'http://www.bbc.co.uk/programmes/b04v20dw',
+            'info_dict': {
+                'id': 'b04v209v',
+                'ext': 'flv',
+                'title': 'Pete Tong, The Essential New Tune Special',
+                'description': "Pete has a very special mix - all of 2014's Essential New Tunes!",
+                'duration': 10800,
+            },
+            'params': {
+                # rtmp download
+                'skip_download': True,
+            }
+        }, {
+            'url': 'http://www.bbc.co.uk/music/clips/p02frcc3',
+            'note': 'Audio',
+            'info_dict': {
+                'id': 'p02frcch',
+                'ext': 'flv',
+                'title': 'Pete Tong, Past, Present and Future Special, Madeon - After Hours mix',
+                'description': 'French house superstar Madeon takes us out of the club and onto the after party.',
+                'duration': 3507,
+            },
+            'params': {
+                # rtmp download
+                'skip_download': True,
+            }
+        }, {
+            'url': 'http://www.bbc.co.uk/music/clips/p025c0zz',
+            'note': 'Video',
+            'info_dict': {
+                'id': 'p025c103',
+                'ext': 'flv',
+                'title': 'Reading and Leeds Festival, 2014, Rae Morris - Closer (Live on BBC Three)',
+                'description': 'Rae Morris performs Closer for BBC Three at Reading 2014',
+                'duration': 226,
+            },
+            'params': {
+                # rtmp download
+                'skip_download': True,
+            }
+        }, {
+            'url': 'http://www.bbc.co.uk/iplayer/playlist/p01dvks4',
+            'only_matching': True,
+        }, {
+            'url': 'http://www.bbc.co.uk/music/clips#p02frcc3',
+            'only_matching': True,
+        }, {
+            'url': 'http://www.bbc.co.uk/iplayer/cbeebies/episode/b0480276/bing-14-atchoo',
+            'only_matching': True,
         }
     ]
 
@@ -102,6 +168,10 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
         return playlist.findall('./{http://bbc.co.uk/2008/emp/playlist}item')
 
     def _extract_medias(self, media_selection):
+        error = media_selection.find('./{http://bbc.co.uk/2008/mp/mediaselection}error')
+        if error is not None:
+            raise ExtractorError(
+                '%s returned error: %s' % (self.IE_NAME, error.get('id')), expected=True)
         return media_selection.findall('./{http://bbc.co.uk/2008/mp/mediaselection}media')
 
     def _extract_connections(self, media):
@@ -158,54 +228,101 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
             subtitles[lang] = srt
         return subtitles
 
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        group_id = mobj.group('id')
-
-        webpage = self._download_webpage(url, group_id, 'Downloading video page')
-        if re.search(r'id="emp-error" class="notinuk">', webpage):
-            raise ExtractorError('Currently BBC iPlayer TV programmes are available to play in the UK only',
-                                 expected=True)
-
-        playlist = self._download_xml('http://www.bbc.co.uk/iplayer/playlist/%s' % group_id, group_id,
-                                      'Downloading playlist XML')
+    def _download_media_selector(self, programme_id):
+        try:
+            media_selection = self._download_xml(
+                'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s' % programme_id,
+                programme_id, 'Downloading media selection XML')
+        except ExtractorError as ee:
+            if isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 403:
+                media_selection = xml.etree.ElementTree.fromstring(ee.cause.read().encode('utf-8'))
+            else:
+                raise
+
+        formats = []
+        subtitles = None
+
+        for media in self._extract_medias(media_selection):
+            kind = media.get('kind')
+            if kind == 'audio':
+                formats.extend(self._extract_audio(media, programme_id))
+            elif kind == 'video':
+                formats.extend(self._extract_video(media, programme_id))
+            elif kind == 'captions':
+                subtitles = self._extract_captions(media, programme_id)
+
+        return formats, subtitles
+
+    def _download_playlist(self, playlist_id):
+        try:
+            playlist = self._download_json(
+                'http://www.bbc.co.uk/programmes/%s/playlist.json' % playlist_id,
+                playlist_id, 'Downloading playlist JSON')
+
+            version = playlist.get('defaultAvailableVersion')
+            if version:
+                smp_config = version['smpConfig']
+                title = smp_config['title']
+                description = smp_config['summary']
+                for item in smp_config['items']:
+                    kind = item['kind']
+                    if kind != 'programme' and kind != 'radioProgramme':
+                        continue
+                    programme_id = item.get('vpid')
+                    duration = int(item.get('duration'))
+                    formats, subtitles = self._download_media_selector(programme_id)
+                    return programme_id, title, description, duration, formats, subtitles
+        except ExtractorError as ee:
+            if not (isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 404):
+                raise
+
+        # fallback to legacy playlist
+        playlist = self._download_xml(
+            'http://www.bbc.co.uk/iplayer/playlist/%s' % playlist_id,
+            playlist_id, 'Downloading legacy playlist XML')
 
         no_items = playlist.find('./{http://bbc.co.uk/2008/emp/playlist}noItems')
         if no_items is not None:
             reason = no_items.get('reason')
             if reason == 'preAvailability':
-                msg = 'Episode %s is not yet available' % group_id
+                msg = 'Episode %s is not yet available' % playlist_id
             elif reason == 'postAvailability':
-                msg = 'Episode %s is no longer available' % group_id
+                msg = 'Episode %s is no longer available' % playlist_id
+            elif reason == 'noMedia':
+                msg = 'Episode %s is not currently available' % playlist_id
             else:
-                msg = 'Episode %s is not available: %s' % (group_id, reason)
+                msg = 'Episode %s is not available: %s' % (playlist_id, reason)
             raise ExtractorError(msg, expected=True)
 
-        formats = []
-        subtitles = None
-
         for item in self._extract_items(playlist):
             kind = item.get('kind')
             if kind != 'programme' and kind != 'radioProgramme':
                 continue
             title = playlist.find('./{http://bbc.co.uk/2008/emp/playlist}title').text
             description = playlist.find('./{http://bbc.co.uk/2008/emp/playlist}summary').text
-
             programme_id = item.get('identifier')
             duration = int(item.get('duration'))
+            formats, subtitles = self._download_media_selector(programme_id)
 
-            media_selection = self._download_xml(
-                'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s' % programme_id,
-                programme_id, 'Downloading media selection XML')
-
-            for media in self._extract_medias(media_selection):
-                kind = media.get('kind')
-                if kind == 'audio':
-                    formats.extend(self._extract_audio(media, programme_id))
-                elif kind == 'video':
-                    formats.extend(self._extract_video(media, programme_id))
-                elif kind == 'captions':
-                    subtitles = self._extract_captions(media, programme_id)
+        return programme_id, title, description, duration, formats, subtitles
+
+    def _real_extract(self, url):
+        group_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, group_id, 'Downloading video page')
+
+        programme_id = self._search_regex(
+            r'"vpid"\s*:\s*"([\da-z]{8})"', webpage, 'vpid', fatal=False, default=None)
+        if programme_id:
+            player = self._download_json(
+                'http://www.bbc.co.uk/iplayer/episode/%s.json' % group_id,
+                group_id)['jsConf']['player']
+            title = player['title']
+            description = player['subtitle']
+            duration = player['duration']
+            formats, subtitles = self._download_media_selector(programme_id)
+        else:
+            programme_id, title, description, duration, formats, subtitles = self._download_playlist(group_id)
 
         if self._downloader.params.get('listsubtitles', False):
             self._list_available_subtitles(programme_id, subtitles)
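Reviewer note: one subtlety in _download_media_selector worth flagging is that the mediaselector endpoint answers geo-blocked requests with an HTTP 403 whose body is itself a valid mediaselection XML document carrying an error element, which is why the except branch re-parses ee.cause.read() instead of bailing out. The pattern in isolation (helper names here are hypothetical stand-ins, not the extractor's API):

    import xml.etree.ElementTree

    def fetch_media_selection(download_xml, http_error_cls, url):
        # download_xml / http_error_cls stand in for the extractor helpers
        try:
            return download_xml(url)
        except http_error_cls as err:
            if err.code == 403:
                # the 403 body still parses as mediaselection XML
                return xml.etree.ElementTree.fromstring(err.read())
            raise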
@@ -9,7 +9,7 @@ class BeegIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?beeg\.com/(?P<id>\d+)'
     _TEST = {
         'url': 'http://beeg.com/5416503',
-        'md5': '634526ae978711f6b748fe0dd6c11f57',
+        'md5': '1bff67111adb785c51d1b42959ec10e5',
         'info_dict': {
             'id': '5416503',
             'ext': 'mp4',
@@ -10,15 +10,15 @@ from ..utils import url_basename
 class BehindKinkIE(InfoExtractor):
     _VALID_URL = r'http://(?:www\.)?behindkink\.com/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<id>[^/#?_]+)'
     _TEST = {
-        'url': 'http://www.behindkink.com/2014/08/14/ab1576-performers-voice-finally-heard-the-bill-is-killed/',
-        'md5': '41ad01222b8442089a55528fec43ec01',
+        'url': 'http://www.behindkink.com/2014/12/05/what-are-you-passionate-about-marley-blaze/',
+        'md5': '507b57d8fdcd75a41a9a7bdb7989c762',
         'info_dict': {
-            'id': '36370',
+            'id': '37127',
             'ext': 'mp4',
-            'title': 'AB1576 - PERFORMERS VOICE FINALLY HEARD - THE BILL IS KILLED!',
-            'description': 'The adult industry voice was finally heard as Assembly Bill 1576 remained\xa0 in suspense today at the Senate Appropriations Hearing. AB1576 was, among other industry damaging issues, a condom mandate...',
-            'upload_date': '20140814',
-            'thumbnail': 'http://www.behindkink.com/wp-content/uploads/2014/08/36370_AB1576_Win.jpg',
+            'title': 'What are you passionate about – Marley Blaze',
+            'description': 'md5:aee8e9611b4ff70186f752975d9b94b4',
+            'upload_date': '20141205',
+            'thumbnail': 'http://www.behindkink.com/wp-content/uploads/2014/12/blaze-1.jpg',
             'age_limit': 18,
         }
     }
@@ -26,26 +26,19 @@ class BehindKinkIE(InfoExtractor):
     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url)
         display_id = mobj.group('id')
-        year = mobj.group('year')
-        month = mobj.group('month')
-        day = mobj.group('day')
-        upload_date = year + month + day
 
         webpage = self._download_webpage(url, display_id)
 
         video_url = self._search_regex(
-            r"'file':\s*'([^']+)'",
-            webpage, 'URL base')
-
-        video_id = url_basename(video_url)
-        video_id = video_id.split('_')[0]
+            r'<source src="([^"]+)"', webpage, 'video URL')
+        video_id = url_basename(video_url).split('_')[0]
+        upload_date = mobj.group('year') + mobj.group('month') + mobj.group('day')
 
         return {
             'id': video_id,
-            'url': video_url,
-            'ext': 'mp4',
-            'title': self._og_search_title(webpage),
             'display_id': display_id,
+            'url': video_url,
+            'title': self._og_search_title(webpage),
             'thumbnail': self._og_search_thumbnail(webpage),
             'description': self._og_search_description(webpage),
             'upload_date': upload_date,
youtube_dl/extractor/bet.py (new file, 107 lines)
@@ -0,0 +1,107 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import compat_urllib_parse
+from ..utils import (
+    xpath_text,
+    xpath_with_ns,
+    int_or_none,
+    parse_iso8601,
+)
+
+
+class BetIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?bet\.com/(?:[^/]+/)+(?P<id>.+?)\.html'
+    _TESTS = [
+        {
+            'url': 'http://www.bet.com/news/politics/2014/12/08/in-bet-exclusive-obama-talks-race-and-racism.html',
+            'info_dict': {
+                'id': '740ab250-bb94-4a8a-8787-fe0de7c74471',
+                'display_id': 'in-bet-exclusive-obama-talks-race-and-racism',
+                'ext': 'flv',
+                'title': 'BET News Presents: A Conversation With President Obama',
+                'description': 'md5:5a88d8ae912c1b33e090290af7ec33c6',
+                'duration': 1534,
+                'timestamp': 1418075340,
+                'upload_date': '20141208',
+                'uploader': 'admin',
+                'thumbnail': 're:(?i)^https?://.*\.jpg$',
+            },
+            'params': {
+                # rtmp download
+                'skip_download': True,
+            },
+        },
+        {
+            'url': 'http://www.bet.com/video/news/national/2014/justice-for-ferguson-a-community-reacts.html',
+            'info_dict': {
+                'id': 'bcd1b1df-673a-42cf-8d01-b282db608f2d',
+                'display_id': 'justice-for-ferguson-a-community-reacts',
+                'ext': 'flv',
+                'title': 'Justice for Ferguson: A Community Reacts',
+                'description': 'A BET News special.',
+                'duration': 1696,
+                'timestamp': 1416942360,
+                'upload_date': '20141125',
+                'uploader': 'admin',
+                'thumbnail': 're:(?i)^https?://.*\.jpg$',
+            },
+            'params': {
+                # rtmp download
+                'skip_download': True,
+            },
+        }
+    ]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+
+        media_url = compat_urllib_parse.unquote(self._search_regex(
+            [r'mediaURL\s*:\s*"([^"]+)"', r"var\s+mrssMediaUrl\s*=\s*'([^']+)'"],
+            webpage, 'media URL'))
+
+        mrss = self._download_xml(media_url, display_id)
+
+        item = mrss.find('./channel/item')
+
+        NS_MAP = {
+            'dc': 'http://purl.org/dc/elements/1.1/',
+            'media': 'http://search.yahoo.com/mrss/',
+            'ka': 'http://kickapps.com/karss',
+        }
+
+        title = xpath_text(item, './title', 'title')
+        description = xpath_text(
+            item, './description', 'description', fatal=False)
+
+        video_id = xpath_text(item, './guid', 'video id', fatal=False)
+
+        timestamp = parse_iso8601(xpath_text(
+            item, xpath_with_ns('./dc:date', NS_MAP),
+            'upload date', fatal=False))
+        uploader = xpath_text(
+            item, xpath_with_ns('./dc:creator', NS_MAP),
+            'uploader', fatal=False)
+
+        media_content = item.find(
+            xpath_with_ns('./media:content', NS_MAP))
+        duration = int_or_none(media_content.get('duration'))
+        smil_url = media_content.get('url')
+
+        thumbnail = media_content.find(
+            xpath_with_ns('./media:thumbnail', NS_MAP)).get('url')
+
+        formats = self._extract_smil_formats(smil_url, display_id)
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
+            'uploader': uploader,
+            'duration': duration,
+            'formats': formats,
+        }
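Reviewer note: BetIE can be exercised without downloading by listing what it extracted from the SMIL manifest. A rough sketch, again assuming a checkout with the extractor wired into the extractor list (network access required):

    import youtube_dl

    ydl = youtube_dl.YoutubeDL({'skip_download': True})
    info = ydl.extract_info(
        'http://www.bet.com/video/news/national/2014/justice-for-ferguson-a-community-reacts.html',
        download=False)
    for f in info['formats']:
        print(f['format_id'], f.get('width'), f.get('height'))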
@@ -5,8 +5,6 @@ import re
 
 from .common import InfoExtractor
 from ..utils import (
-    compat_parse_qs,
-    ExtractorError,
     int_or_none,
     unified_strdate,
 )
@@ -29,10 +27,9 @@ class BiliBiliIE(InfoExtractor):
     }
 
     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
 
         webpage = self._download_webpage(url, video_id)
 
         video_code = self._search_regex(
             r'(?s)<div itemprop="video".*?>(.*?)</div>', webpage, 'video code')
@@ -55,45 +52,38 @@ class BiliBiliIE(InfoExtractor):
         thumbnail = self._html_search_meta(
             'thumbnailUrl', video_code, 'thumbnail', fatal=False)
 
-        player_params = compat_parse_qs(self._html_search_regex(
-            r'<iframe .*?class="player" src="https://secure\.bilibili\.(?:tv|com)/secure,([^"]+)"',
-            webpage, 'player params'))
+        cid = self._search_regex(r'cid=(\d+)', webpage, 'cid')
 
-        if 'cid' in player_params:
-            cid = player_params['cid'][0]
-
-            lq_doc = self._download_xml(
-                'http://interface.bilibili.cn/v_cdn_play?cid=%s' % cid,
-                video_id,
-                note='Downloading LQ video info'
-            )
-            lq_durl = lq_doc.find('.//durl')
-            formats = [{
-                'format_id': 'lq',
-                'quality': 1,
-                'url': lq_durl.find('./url').text,
-                'filesize': int_or_none(
-                    lq_durl.find('./size'), get_attr='text'),
-            }]
-
-            hq_doc = self._download_xml(
-                'http://interface.bilibili.cn/playurl?cid=%s' % cid,
-                video_id,
-                note='Downloading HQ video info',
-                fatal=False,
-            )
-            if hq_doc is not False:
-                hq_durl = hq_doc.find('.//durl')
-                formats.append({
-                    'format_id': 'hq',
-                    'quality': 2,
-                    'ext': 'flv',
-                    'url': hq_durl.find('./url').text,
-                    'filesize': int_or_none(
-                        hq_durl.find('./size'), get_attr='text'),
-                })
-        else:
-            raise ExtractorError('Unsupported player parameters: %r' % (player_params,))
+        lq_doc = self._download_xml(
+            'http://interface.bilibili.com/v_cdn_play?appkey=1&cid=%s' % cid,
+            video_id,
+            note='Downloading LQ video info'
+        )
+        lq_durl = lq_doc.find('./durl')
+        formats = [{
+            'format_id': 'lq',
+            'quality': 1,
+            'url': lq_durl.find('./url').text,
+            'filesize': int_or_none(
+                lq_durl.find('./size'), get_attr='text'),
+        }]
+
+        hq_doc = self._download_xml(
+            'http://interface.bilibili.com/playurl?appkey=1&cid=%s' % cid,
+            video_id,
+            note='Downloading HQ video info',
+            fatal=False,
+        )
+        if hq_doc is not False:
+            hq_durl = hq_doc.find('./durl')
+            formats.append({
+                'format_id': 'hq',
+                'quality': 2,
+                'ext': 'flv',
+                'url': hq_durl.find('./url').text,
+                'filesize': int_or_none(
+                    hq_durl.find('./size'), get_attr='text'),
+            })
 
         self._sort_formats(formats)
         return {
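Reviewer note: on the 'quality' keys above, _sort_formats orders candidates ascending and youtube-dl treats the last entry as best, so the HQ variant (quality 2) outranks LQ (quality 1) without a hard-coded preference elsewhere. A toy model of just that ordering (standalone, mimicking the sort key only):

    formats = [
        {'format_id': 'lq', 'quality': 1},
        {'format_id': 'hq', 'quality': 2},
    ]
    # mirrors _sort_formats' behaviour for the quality field alone
    formats.sort(key=lambda f: f['quality'])
    print(formats[-1]['format_id'])  # -> 'hq'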
@@ -1,40 +1,35 @@
 from __future__ import unicode_literals
 
 import json
-import re
 
 from .common import InfoExtractor
-from ..utils import remove_start
+from ..utils import (
+    remove_start,
+    int_or_none,
+)
 
 
 class BlinkxIE(InfoExtractor):
-    _VALID_URL = r'^(?:https?://(?:www\.)blinkx\.com/#?ce/|blinkx:)(?P<id>[^?]+)'
+    _VALID_URL = r'(?:https?://(?:www\.)blinkx\.com/#?ce/|blinkx:)(?P<id>[^?]+)'
     IE_NAME = 'blinkx'
 
     _TEST = {
-        'url': 'http://www.blinkx.com/ce/8aQUy7GVFYgFzpKhT0oqsilwOGFRVXk3R1ZGWWdGenBLaFQwb3FzaWx3OGFRVXk3R1ZGWWdGenB',
-        'md5': '2e9a07364af40163a908edbf10bb2492',
+        'url': 'http://www.blinkx.com/ce/Da0Gw3xc5ucpNduzLuDDlv4WC9PuI4fDi1-t6Y3LyfdY2SZS5Urbvn-UPJvrvbo8LTKTc67Wu2rPKSQDJyZeeORCR8bYkhs8lI7eqddznH2ofh5WEEdjYXnoRtj7ByQwt7atMErmXIeYKPsSDuMAAqJDlQZ-3Ff4HJVeH_s3Gh8oQ',
+        'md5': '337cf7a344663ec79bf93a526a2e06c7',
         'info_dict': {
-            'id': '8aQUy7GV',
+            'id': 'Da0Gw3xc',
             'ext': 'mp4',
-            'title': 'Police Car Rolls Away',
-            'uploader': 'stupidvideos.com',
-            'upload_date': '20131215',
-            'timestamp': 1387068000,
-            'description': 'A police car gently rolls away from a fight. Maybe it felt weird being around a confrontation and just had to get out of there!',
-            'duration': 14.886,
-            'thumbnails': [{
-                'width': 100,
-                'height': 76,
-                'resolution': '100x76',
-                'url': 'http://cdn.blinkx.com/stream/b/41/StupidVideos/20131215/1873969261/1873969261_tn_0.jpg',
-            }],
+            'title': 'No Daily Show for John Oliver; HBO Show Renewed - IGN News',
+            'uploader': 'IGN News',
+            'upload_date': '20150217',
+            'timestamp': 1424215740,
+            'description': 'HBO has renewed Last Week Tonight With John Oliver for two more seasons.',
+            'duration': 47.743333,
         },
     }
 
-    def _real_extract(self, rl):
-        m = re.match(self._VALID_URL, rl)
-        video_id = m.group('id')
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
        display_id = video_id[:8]

        api_url = ('https://apib4.blinkx.com/api.php?action=play_video&' +
@@ -60,18 +55,20 @@ class BlinkxIE(InfoExtractor):
             elif m['type'] in ('flv', 'mp4'):
                 vcodec = remove_start(m['vcodec'], 'ff')
                 acodec = remove_start(m['acodec'], 'ff')
-                tbr = (int(m['vbr']) + int(m['abr'])) // 1000
+                vbr = int_or_none(m.get('vbr') or m.get('vbitrate'), 1000)
+                abr = int_or_none(m.get('abr') or m.get('abitrate'), 1000)
+                tbr = vbr + abr if vbr and abr else None
                 format_id = '%s-%sk-%s' % (vcodec, tbr, m['w'])
                 formats.append({
                     'format_id': format_id,
                     'url': m['link'],
                     'vcodec': vcodec,
                     'acodec': acodec,
-                    'abr': int(m['abr']) // 1000,
-                    'vbr': int(m['vbr']) // 1000,
+                    'abr': abr,
+                    'vbr': vbr,
                     'tbr': tbr,
-                    'width': int(m['w']),
-                    'height': int(m['h']),
+                    'width': int_or_none(m.get('w')),
+                    'height': int_or_none(m.get('h')),
                 })
 
         self._sort_formats(formats)
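Reviewer note: the int_or_none(..., 1000) calls replace bare int(...) // 1000 arithmetic so a missing or malformed bitrate degrades to None instead of raising. Roughly what the utils helper does (simplified re-implementation for illustration, not the exact code in youtube_dl/utils.py):

    def int_or_none(v, scale=1):
        # None/'' or unparsable input yields None instead of an exception
        if v in (None, ''):
            return None
        try:
            return int(v) // scale
        except (TypeError, ValueError):
            return None

    print(int_or_none('48000', 1000))  # 48
    print(int_or_none(None, 1000))     # None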
@@ -4,13 +4,17 @@ import re
 
 from .common import InfoExtractor
 from .subtitles import SubtitlesInfoExtractor
-from ..utils import (
-    compat_urllib_request,
-    unescapeHTML,
-    parse_iso8601,
-    compat_urlparse,
-    clean_html,
+from ..compat import (
     compat_str,
+    compat_urllib_request,
+    compat_urlparse,
+)
+from ..utils import (
+    clean_html,
+    int_or_none,
+    parse_iso8601,
+    unescapeHTML,
 )
 
 
@@ -78,7 +82,25 @@ class BlipTVIE(SubtitlesInfoExtractor):
                 'uploader': 'NostalgiaCritic',
                 'uploader_id': '246467',
             }
-        }
+        },
+        {
+            # https://github.com/rg3/youtube-dl/pull/4404
+            'note': 'Audio only',
+            'url': 'http://blip.tv/hilarios-productions/weekly-manga-recap-kingdom-7119982',
+            'md5': '76c0a56f24e769ceaab21fbb6416a351',
+            'info_dict': {
+                'id': '7103299',
+                'ext': 'flv',
+                'title': 'Weekly Manga Recap: Kingdom',
+                'description': 'And then Shin breaks the enemy line, and he\'s all like HWAH! And then he slices a guy and it\'s all like FWASHING! And... it\'s really hard to describe the best parts of this series without breaking down into sound effects, okay?',
+                'timestamp': 1417660321,
+                'upload_date': '20141204',
+                'uploader': 'The Rollo T',
+                'uploader_id': '407429',
+                'duration': 7251,
+                'vcodec': 'none',
+            }
+        },
     ]
 
     def _real_extract(self, url):
@@ -145,11 +167,11 @@ class BlipTVIE(SubtitlesInfoExtractor):
                     'url': real_url,
                     'format_id': role,
                     'format_note': media_type,
-                    'vcodec': media_content.get(blip('vcodec')),
+                    'vcodec': media_content.get(blip('vcodec')) or 'none',
                     'acodec': media_content.get(blip('acodec')),
                     'filesize': media_content.get('filesize'),
-                    'width': int(media_content.get('width')),
-                    'height': int(media_content.get('height')),
+                    'width': int_or_none(media_content.get('width')),
+                    'height': int_or_none(media_content.get('height')),
                 })
         self._sort_formats(formats)
 
@@ -177,7 +199,7 @@ class BlipTVIE(SubtitlesInfoExtractor):
         # For some weird reason, blip.tv serves a video instead of subtitles
         # when we request with a common UA
         req = compat_urllib_request.Request(url)
-        req.add_header('Youtubedl-user-agent', 'youtube-dl')
+        req.add_header('User-Agent', 'youtube-dl')
         return self._download_webpage(req, None, note=False)
@@ -14,7 +14,6 @@ class BreakIE(InfoExtractor):
     _VALID_URL = r'http://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
     _TESTS = [{
         'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056',
-        'md5': '33aa4ff477ecd124d18d7b5d23b87ce5',
         'info_dict': {
             'id': '2468056',
             'ext': 'mp4',
@@ -6,25 +6,26 @@ import json
 import xml.etree.ElementTree
 
 from .common import InfoExtractor
-from ..utils import (
-    compat_urllib_parse,
-    find_xpath_attr,
-    fix_xml_ampersands,
-    compat_urlparse,
-    compat_str,
-    compat_urllib_request,
+from ..compat import (
     compat_parse_qs,
+    compat_str,
+    compat_urllib_parse,
     compat_urllib_parse_urlparse,
+    compat_urllib_request,
+    compat_urlparse,
+)
+from ..utils import (
     determine_ext,
     ExtractorError,
-    unsmuggle_url,
+    find_xpath_attr,
+    fix_xml_ampersands,
     unescapeHTML,
+    unsmuggle_url,
 )
 
 
 class BrightcoveIE(InfoExtractor):
-    _VALID_URL = r'https?://.*brightcove\.com/(services|viewer).*?\?(?P<query>.*)'
+    _VALID_URL = r'(?:https?://.*brightcove\.com/(services|viewer).*?\?|brightcove:)(?P<query>.*)'
     _FEDERATED_URL_TEMPLATE = 'http://c.brightcove.com/services/viewer/htmlFederated?%s'
 
     _TESTS = [
@@ -94,6 +95,7 @@ class BrightcoveIE(InfoExtractor):
         'url': 'http://c.brightcove.com/services/viewer/htmlFederated?playerID=3550052898001&playerKey=AQ%7E%7E%2CAAABmA9XpXk%7E%2C-Kp7jNgisre1fG5OdqpAFUTcs0lP_ZoL',
         'info_dict': {
             'title': 'Sealife',
+            'id': '3550319591001',
         },
         'playlist_mincount': 7,
     },
@@ -107,7 +109,7 @@ class BrightcoveIE(InfoExtractor):
         """
 
         # Fix up some stupid HTML, see https://github.com/rg3/youtube-dl/issues/1553
-        object_str = re.sub(r'(<param name="[^"]+" value="[^"]+")>',
+        object_str = re.sub(r'(<param(?:\s+[a-zA-Z0-9_]+="[^"]*")*)>',
                             lambda m: m.group(1) + '/>', object_str)
         # Fix up some stupid XML, see https://github.com/rg3/youtube-dl/issues/1608
         object_str = object_str.replace('<--', '<!--')
@@ -246,7 +248,7 @@ class BrightcoveIE(InfoExtractor):
         playlist_info = json_data['videoList']
         videos = [self._extract_video_info(video_info) for video_info in playlist_info['mediaCollectionDTO']['videoDTOs']]
 
-        return self.playlist_result(videos, playlist_id=playlist_info['id'],
+        return self.playlist_result(videos, playlist_id='%s' % playlist_info['id'],
                                     playlist_title=playlist_info['mediaCollectionDTO']['displayName'])
 
     def _extract_video_info(self, video_info):
@@ -265,6 +267,7 @@ class BrightcoveIE(InfoExtractor):
                 url = rend['defaultURL']
                 if not url:
                     continue
+                ext = None
                 if rend['remote']:
                     url_comp = compat_urllib_parse_urlparse(url)
                     if url_comp.path.endswith('.m3u8'):
@@ -276,7 +279,7 @@ class BrightcoveIE(InfoExtractor):
                         # akamaihd.net, but they don't use f4m manifests
                         url = url.replace('control/', '') + '?&v=3.3.0&fp=13&r=FEEFJ&g=RTSJIMBMPFPB'
                         ext = 'flv'
-                else:
+                if ext is None:
                     ext = determine_ext(url)
                 size = rend.get('size')
                 formats.append({
@@ -33,7 +33,8 @@ class BuzzFeedIE(InfoExtractor):
             'skip_download': True,  # Got enough YouTube download tests
         },
         'info_dict': {
-            'description': 'Munchkin the Teddy Bear is back !',
+            'id': 'look-at-this-cute-dog-omg',
+            'description': 're:Munchkin the Teddy Bear is back ?!',
             'title': 'You Need To Stop What You\'re Doing And Watching This Dog Walk On A Treadmill',
         },
         'playlist': [{
@@ -42,9 +43,9 @@ class BuzzFeedIE(InfoExtractor):
             'ext': 'mp4',
             'upload_date': '20141124',
             'uploader_id': 'CindysMunchkin',
-            'description': '© 2014 Munchkin the Shih Tzu\nAll rights reserved\nFacebook: http://facebook.com/MunchkintheShihTzu',
-            'uploader': 'Munchkin the Shih Tzu',
-            'title': 'Munchkin the Teddy Bear gets her exercise',
+            'description': 're:© 2014 Munchkin the',
+            'uploader': 're:^Munchkin the',
+            'title': 're:Munchkin the Teddy Bear gets her exercise',
         },
     }]
 }]
youtube_dl/extractor/camdemy.py (new file, 153 lines)
@@ -0,0 +1,153 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import datetime
+import re
+
+from .common import InfoExtractor
+from ..compat import (
+    compat_urllib_parse,
+    compat_urlparse,
+)
+from ..utils import (
+    parse_iso8601,
+    str_to_int,
+)
+
+
+class CamdemyIE(InfoExtractor):
+    _VALID_URL = r'http://(?:www\.)?camdemy\.com/media/(?P<id>\d+)'
+    _TESTS = [{
+        # single file
+        'url': 'http://www.camdemy.com/media/5181/',
+        'md5': '5a5562b6a98b37873119102e052e311b',
+        'info_dict': {
+            'id': '5181',
+            'ext': 'mp4',
+            'title': 'Ch1-1 Introduction, Signals (02-23-2012)',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'description': '',
+            'creator': 'ss11spring',
+            'upload_date': '20130114',
+            'timestamp': 1358154556,
+            'view_count': int,
+        }
+    }, {
+        # With non-empty description
+        'url': 'http://www.camdemy.com/media/13885',
+        'md5': '4576a3bb2581f86c61044822adbd1249',
+        'info_dict': {
+            'id': '13885',
+            'ext': 'mp4',
+            'title': 'EverCam + Camdemy QuickStart',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'description': 'md5:050b62f71ed62928f8a35f1a41e186c9',
+            'creator': 'evercam',
+            'upload_date': '20140620',
+            'timestamp': 1403271569,
+        }
+    }, {
+        # External source
+        'url': 'http://www.camdemy.com/media/14842',
+        'md5': '50e1c3c3aa233d3d7b7daa2fa10b1cf7',
+        'info_dict': {
+            'id': '2vsYQzNIsJo',
+            'ext': 'mp4',
+            'upload_date': '20130211',
+            'uploader': 'Hun Kim',
+            'description': 'Excel 2013 Tutorial for Beginners - How to add Password Protection',
+            'uploader_id': 'hunkimtutorials',
+            'title': 'Excel 2013 Tutorial - How to add Password Protection',
+        }
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        page = self._download_webpage(url, video_id)
+
+        src_from = self._html_search_regex(
+            r"<div class='srcFrom'>Source: <a title='([^']+)'", page,
+            'external source', default=None)
+        if src_from:
+            return self.url_result(src_from)
+
+        oembed_obj = self._download_json(
+            'http://www.camdemy.com/oembed/?format=json&url=' + url, video_id)
+
+        thumb_url = oembed_obj['thumbnail_url']
+        video_folder = compat_urlparse.urljoin(thumb_url, 'video/')
+        file_list_doc = self._download_xml(
+            compat_urlparse.urljoin(video_folder, 'fileList.xml'),
+            video_id, 'Filelist XML')
+        file_name = file_list_doc.find('./video/item/fileName').text
+        video_url = compat_urlparse.urljoin(video_folder, file_name)
+
+        timestamp = parse_iso8601(self._html_search_regex(
+            r"<div class='title'>Posted\s*:</div>\s*<div class='value'>([^<>]+)<",
+            page, 'creation time', fatal=False),
+            delimiter=' ', timezone=datetime.timedelta(hours=8))
+        view_count = str_to_int(self._html_search_regex(
+            r"<div class='title'>Views\s*:</div>\s*<div class='value'>([^<>]+)<",
+            page, 'view count', fatal=False))
+
+        return {
+            'id': video_id,
+            'url': video_url,
+            'title': oembed_obj['title'],
+            'thumbnail': thumb_url,
+            'description': self._html_search_meta('description', page),
+            'creator': oembed_obj['author_name'],
+            'duration': oembed_obj['duration'],
+            'timestamp': timestamp,
+            'view_count': view_count,
+        }
+
+
+class CamdemyFolderIE(InfoExtractor):
+    _VALID_URL = r'http://www.camdemy.com/folder/(?P<id>\d+)'
+    _TESTS = [{
+        # links with trailing slash
+        'url': 'http://www.camdemy.com/folder/450',
+        'info_dict': {
+            'id': '450',
+            'title': '信號與系統 2012 & 2011 (Signals and Systems)',
+        },
+        'playlist_mincount': 145
+    }, {
+        # links without trailing slash
+        # and multi-page
+        'url': 'http://www.camdemy.com/folder/853',
+        'info_dict': {
+            'id': '853',
+            'title': '科學計算 - 使用 Matlab'
+        },
+        'playlist_mincount': 20
+    }, {
+        # with displayMode parameter. For testing the codes to add parameters
+        'url': 'http://www.camdemy.com/folder/853/?displayMode=defaultOrderByOrg',
+        'info_dict': {
+            'id': '853',
+            'title': '科學計算 - 使用 Matlab'
+        },
+        'playlist_mincount': 20
+    }]
+
+    def _real_extract(self, url):
+        folder_id = self._match_id(url)
+
+        # Add displayMode=list so that all links are displayed in a single page
+        parsed_url = list(compat_urlparse.urlparse(url))
+        query = dict(compat_urlparse.parse_qsl(parsed_url[4]))
+        query.update({'displayMode': 'list'})
+        parsed_url[4] = compat_urllib_parse.urlencode(query)
+        final_url = compat_urlparse.urlunparse(parsed_url)
+
+        page = self._download_webpage(final_url, folder_id)
+        matches = re.findall(r"href='(/media/\d+/?)'", page)
+
+        entries = [self.url_result('http://www.camdemy.com' + media_path)
+                   for media_path in matches]
+
+        folder_title = self._html_search_meta('keywords', page)
+
+        return self.playlist_result(entries, folder_id, folder_title)
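Reviewer note: the folder extractor's displayMode=list rewrite is plain query-string surgery. The same trick with only the standard library, for reference (runs on both Python 2 and 3):

    try:
        from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse  # Python 3
    except ImportError:
        from urlparse import urlparse, parse_qsl, urlunparse  # Python 2
        from urllib import urlencode

    url = 'http://www.camdemy.com/folder/853/?displayMode=defaultOrderByOrg'
    parts = list(urlparse(url))
    query = dict(parse_qsl(parts[4]))
    query.update({'displayMode': 'list'})  # force the single-page listing
    parts[4] = urlencode(query)
    print(urlunparse(parts))  # .../folder/853/?displayMode=list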
@@ -5,6 +5,8 @@ import re
 
 from .common import InfoExtractor
 from ..utils import (
+    ExtractorError,
+    HEADRequest,
     unified_strdate,
     url_basename,
     qualities,
@@ -13,12 +15,13 @@ from ..utils import (
 
 class CanalplusIE(InfoExtractor):
     IE_DESC = 'canalplus.fr, piwiplus.fr and d8.tv'
-    _VALID_URL = r'https?://(?:www\.(?P<site>canalplus\.fr|piwiplus\.fr|d8\.tv)/.*?/(?P<path>.*)|player\.canalplus\.fr/#/(?P<id>[0-9]+))'
+    _VALID_URL = r'https?://(?:www\.(?P<site>canalplus\.fr|piwiplus\.fr|d8\.tv|itele\.fr)/.*?/(?P<path>.*)|player\.canalplus\.fr/#/(?P<id>[0-9]+))'
     _VIDEO_INFO_TEMPLATE = 'http://service.canal-plus.com/video/rest/getVideosLiees/%s/%s'
     _SITE_ID_MAP = {
         'canalplus.fr': 'cplus',
         'piwiplus.fr': 'teletoon',
         'd8.tv': 'd8',
+        'itele.fr': 'itele',
     }
 
     _TESTS = [{
@@ -51,6 +54,16 @@ class CanalplusIE(InfoExtractor):
             'upload_date': '20131108',
         },
         'skip': 'videos get deleted after a while',
+    }, {
+        'url': 'http://www.itele.fr/france/video/aubervilliers-un-lycee-en-colere-111559',
+        'md5': '65aa83ad62fe107ce29e564bb8712580',
+        'info_dict': {
+            'id': '1213714',
+            'ext': 'flv',
+            'title': 'Aubervilliers : un lycée en colère - Le 11/02/2015 à 06h45',
+            'description': 'md5:8216206ec53426ea6321321f3b3c16db',
+            'upload_date': '20150211',
+        },
     }]
 
     def _real_extract(self, url):
@@ -76,6 +89,16 @@ class CanalplusIE(InfoExtractor):
 
         preference = qualities(['MOBILE', 'BAS_DEBIT', 'HAUT_DEBIT', 'HD', 'HLS', 'HDS'])
 
+        fmt_url = next(iter(media.find('VIDEOS'))).text
+        if '/geo' in fmt_url.lower():
+            response = self._request_webpage(
+                HEADRequest(fmt_url), video_id,
+                'Checking if the video is georestricted')
+            if '/blocage' in response.geturl():
+                raise ExtractorError(
+                    'The video is not available in your country',
+                    expected=True)
+
         formats = []
         for fmt in media.find('VIDEOS'):
             format_url = fmt.text
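Reviewer note: the georestriction probe leans on the CDN redirecting blocked clients to a /blocage URL, so a HEAD request is enough to learn the verdict without pulling any media. The same idea with the standard library alone (illustrative sketch, not the extractor's actual code path):

    try:
        from urllib.request import Request, urlopen  # Python 3
    except ImportError:
        from urllib2 import Request, urlopen  # Python 2

    def is_georestricted(fmt_url):
        req = Request(fmt_url)
        req.get_method = lambda: 'HEAD'  # only the final URL matters, not the body
        return '/blocage' in urlopen(req).geturl()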
@@ -1,7 +1,5 @@
 from __future__ import unicode_literals
 
-import re
-
 from .common import InfoExtractor
 
 
@@ -39,8 +37,7 @@ class CBSIE(InfoExtractor):
     }]
 
     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)
         real_id = self._search_regex(
             r"video\.settings\.pid\s*=\s*'([^']+)';",
youtube_dl/extractor/cbssports.py (new file, 30 lines)
@@ -0,0 +1,30 @@
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+
+
+class CBSSportsIE(InfoExtractor):
+    _VALID_URL = r'http://www\.cbssports\.com/video/player/(?P<section>[^/]+)/(?P<id>[^/]+)'
+
+    _TEST = {
+        'url': 'http://www.cbssports.com/video/player/tennis/318462531970/0/us-open-flashbacks-1990s',
+        'info_dict': {
+            'id': '_d5_GbO8p1sT',
+            'ext': 'flv',
+            'title': 'US Open flashbacks: 1990s',
+            'description': 'Bill Macatee relives the best moments in US Open history from the 1990s.',
+        },
+    }
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        section = mobj.group('section')
+        video_id = mobj.group('id')
+        all_videos = self._download_json(
+            'http://www.cbssports.com/data/video/player/getVideos/%s?as=json' % section,
+            video_id)
+        # The json file contains the info of all the videos in the section
+        video_info = next(v for v in all_videos if v['pcid'] == video_id)
+        return self.url_result('theplatform:%s' % video_info['pid'], 'ThePlatform')
youtube_dl/extractor/ccc.py (new file, 99 lines)
@@ -0,0 +1,99 @@
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    qualities,
+    unified_strdate,
+)
+
+
+class CCCIE(InfoExtractor):
+    IE_NAME = 'media.ccc.de'
+    _VALID_URL = r'https?://(?:www\.)?media\.ccc\.de/[^?#]+/[^?#/]*?_(?P<id>[0-9]{8,})._[^?#/]*\.html'
+
+    _TEST = {
+        'url': 'http://media.ccc.de/browse/congress/2013/30C3_-_5443_-_en_-_saal_g_-_201312281830_-_introduction_to_processor_design_-_byterazor.html#video',
+        'md5': '205a365d0d57c0b1e43a12c9ffe8f9be',
+        'info_dict': {
+            'id': '20131228183',
+            'ext': 'mp4',
+            'title': 'Introduction to Processor Design',
+            'description': 'md5:5ddbf8c734800267f2cee4eab187bc1b',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'view_count': int,
+            'upload_date': '20131229',
+        }
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        if self._downloader.params.get('prefer_free_formats'):
+            preference = qualities(['mp3', 'opus', 'mp4-lq', 'webm-lq', 'h264-sd', 'mp4-sd', 'webm-sd', 'mp4', 'webm', 'mp4-hd', 'h264-hd', 'webm-hd'])
+        else:
+            preference = qualities(['opus', 'mp3', 'webm-lq', 'mp4-lq', 'webm-sd', 'h264-sd', 'mp4-sd', 'webm', 'mp4', 'webm-hd', 'mp4-hd', 'h264-hd'])
+
+        title = self._html_search_regex(
+            r'(?s)<h1>(.*?)</h1>', webpage, 'title')
+        description = self._html_search_regex(
+            r"(?s)<p class='description'>(.*?)</p>",
+            webpage, 'description', fatal=False)
+        upload_date = unified_strdate(self._html_search_regex(
+            r"(?s)<span class='[^']*fa-calendar-o'></span>(.*?)</li>",
+            webpage, 'upload date', fatal=False))
+        view_count = int_or_none(self._html_search_regex(
+            r"(?s)<span class='[^']*fa-eye'></span>(.*?)</li>",
+            webpage, 'view count', fatal=False))
+
+        matches = re.finditer(r'''(?xs)
+            <(?:span|div)\s+class='label\s+filetype'>(?P<format>.*?)</(?:span|div)>\s*
+            <a\s+href='(?P<http_url>[^']+)'>\s*
+            (?:
+                .*?
+                <a\s+href='(?P<torrent_url>[^']+\.torrent)'
+            )?''', webpage)
+        formats = []
+        for m in matches:
+            format = m.group('format')
+            format_id = self._search_regex(
+                r'.*/([a-z0-9_-]+)/[^/]*$',
+                m.group('http_url'), 'format id', default=None)
+            vcodec = 'h264' if 'h264' in format_id else (
+                'none' if format_id in ('mp3', 'opus') else None
+            )
+            formats.append({
+                'format_id': format_id,
+                'format': format,
+                'url': m.group('http_url'),
+                'vcodec': vcodec,
+                'preference': preference(format_id),
+            })
+
+            if m.group('torrent_url'):
+                formats.append({
+                    'format_id': 'torrent-%s' % (format if format_id is None else format_id),
+                    'format': '%s (torrent)' % format,
+                    'proto': 'torrent',
+                    'format_note': '(unsupported; will just download the .torrent file)',
+                    'vcodec': vcodec,
+                    'preference': -100 + preference(format_id),
+                    'url': m.group('torrent_url'),
+                })
+        self._sort_formats(formats)
+
+        thumbnail = self._html_search_regex(
+            r"<video.*?poster='([^']+)'", webpage, 'thumbnail', fatal=False)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'view_count': view_count,
+            'upload_date': upload_date,
+            'formats': formats,
+        }
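Note: both branches above build a `preference` callable from an ordered list of format ids. The `qualities` helper (imported from `youtube_dl/utils.py`) essentially maps an id to its position in that list, so later entries sort higher and unknown ids sort below everything; a sketch of that behaviour:

```python
# What qualities() does, in essence: index-in-list as a quality score,
# with -1 for ids that are not in the list at all.
def qualities(quality_ids):
    def q(qid):
        try:
            return quality_ids.index(qid)
        except ValueError:
            return -1
    return q

preference = qualities(['mp4-lq', 'mp4-sd', 'mp4-hd'])
assert preference('mp4-hd') > preference('mp4-lq')
assert preference('not-listed') == -1
```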
youtube_dl/extractor/ceskatelevize.py
@@ -3,55 +3,50 @@ from __future__ import unicode_literals
 
 import re
 
-from .common import InfoExtractor
-from ..utils import (
+from .subtitles import SubtitlesInfoExtractor
+from ..compat import (
     compat_urllib_request,
     compat_urllib_parse,
     compat_urllib_parse_urlparse,
+)
+from ..utils import (
     ExtractorError,
+    float_or_none,
 )
 
 
-class CeskaTelevizeIE(InfoExtractor):
+class CeskaTelevizeIE(SubtitlesInfoExtractor):
    _VALID_URL = r'https?://www\.ceskatelevize\.cz/(porady|ivysilani)/(.+/)?(?P<id>[^?#]+)'
 
     _TESTS = [
         {
-            'url': 'http://www.ceskatelevize.cz/ivysilani/10532695142-prvni-republika/213512120230004-spanelska-chripka',
+            'url': 'http://www.ceskatelevize.cz/ivysilani/ivysilani/10441294653-hyde-park-civilizace/214411058091220',
             'info_dict': {
-                'id': '213512120230004',
-                'ext': 'flv',
-                'title': 'První republika: Španělská chřipka',
-                'duration': 3107.4,
+                'id': '214411058091220',
+                'ext': 'mp4',
+                'title': 'Hyde Park Civilizace',
+                'description': 'Věda a současná civilizace. Interaktivní pořad - prostor pro vaše otázky a komentáře',
+                'thumbnail': 're:^https?://.*\.jpg',
+                'duration': 3350,
             },
             'params': {
-                'skip_download': True,  # requires rtmpdump
+                # m3u8 download
+                'skip_download': True,
             },
-            'skip': 'Works only from Czech Republic.',
-        },
-        {
-            'url': 'http://www.ceskatelevize.cz/ivysilani/1030584952-tsatsiki-maminka-a-policajt',
-            'info_dict': {
-                'id': '20138143440',
-                'ext': 'flv',
-                'title': 'Tsatsiki, maminka a policajt',
-                'duration': 6754.1,
-            },
-            'params': {
-                'skip_download': True,  # requires rtmpdump
-            },
-            'skip': 'Works only from Czech Republic.',
         },
         {
             'url': 'http://www.ceskatelevize.cz/ivysilani/10532695142-prvni-republika/bonus/14716-zpevacka-z-duparny-bobina',
             'info_dict': {
                 'id': '14716',
-                'ext': 'flv',
+                'ext': 'mp4',
                 'title': 'První republika: Zpěvačka z Dupárny Bobina',
-                'duration': 90,
+                'description': 'Sága mapující atmosféru první republiky od r. 1918 do r. 1945.',
+                'thumbnail': 're:^https?://.*\.jpg',
+                'duration': 88.4,
             },
             'params': {
-                'skip_download': True,  # requires rtmpdump
+                # m3u8 download
+                'skip_download': True,
             },
         },
     ]
@@ -78,8 +73,9 @@ class CeskaTelevizeIE(InfoExtractor):
             'requestSource': 'iVysilani',
         }
 
-        req = compat_urllib_request.Request('http://www.ceskatelevize.cz/ivysilani/ajax/get-playlist-url',
-                                            data=compat_urllib_parse.urlencode(data))
+        req = compat_urllib_request.Request(
+            'http://www.ceskatelevize.cz/ivysilani/ajax/get-client-playlist',
+            data=compat_urllib_parse.urlencode(data))
 
         req.add_header('Content-type', 'application/x-www-form-urlencoded')
         req.add_header('x-addr', '127.0.0.1')
@@ -88,39 +84,72 @@ class CeskaTelevizeIE(InfoExtractor):
 
         playlistpage = self._download_json(req, video_id)
 
-        req = compat_urllib_request.Request(compat_urllib_parse.unquote(playlistpage['url']))
+        playlist_url = playlistpage['url']
+        if playlist_url == 'error_region':
+            raise ExtractorError(NOT_AVAILABLE_STRING, expected=True)
+
+        req = compat_urllib_request.Request(compat_urllib_parse.unquote(playlist_url))
         req.add_header('Referer', url)
 
-        playlist = self._download_xml(req, video_id)
+        playlist = self._download_json(req, video_id)
 
+        item = playlist['playlist'][0]
         formats = []
-        for i in playlist.find('smilRoot/body'):
-            if 'AD' not in i.attrib['id']:
-                base_url = i.attrib['base']
-                parsedurl = compat_urllib_parse_urlparse(base_url)
-                duration = i.attrib['duration']
-
-                for video in i.findall('video'):
-                    if video.attrib['label'] != 'AD':
-                        format_id = video.attrib['label']
-                        play_path = video.attrib['src']
-                        vbr = int(video.attrib['system-bitrate'])
-
-                        formats.append({
-                            'format_id': format_id,
-                            'url': base_url,
-                            'vbr': vbr,
-                            'play_path': play_path,
-                            'app': parsedurl.path[1:] + '?' + parsedurl.query,
-                            'rtmp_live': True,
-                            'ext': 'flv',
-                        })
-
+        for format_id, stream_url in item['streamUrls'].items():
+            formats.extend(self._extract_m3u8_formats(stream_url, video_id, 'mp4'))
         self._sort_formats(formats)
 
+        title = self._og_search_title(webpage)
+        description = self._og_search_description(webpage)
+        duration = float_or_none(item.get('duration'))
+        thumbnail = item.get('previewImageUrl')
+
+        subtitles = {}
+        subs = item.get('subtitles')
+        if subs:
+            subtitles['cs'] = subs[0]['url']
+
+        if self._downloader.params.get('listsubtitles', False):
+            self._list_available_subtitles(video_id, subtitles)
+            return
+
+        subtitles = self._fix_subtitles(self.extract_subtitles(video_id, subtitles))
+
         return {
             'id': episode_id,
-            'title': self._html_search_regex(r'<title>(.+?) — iVysílání — Česká televize</title>', webpage, 'title'),
-            'duration': float(duration),
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'duration': duration,
             'formats': formats,
+            'subtitles': subtitles,
         }
+
+    @staticmethod
+    def _fix_subtitles(subtitles):
+        """ Convert millisecond-based subtitles to SRT """
+        if subtitles is None:
+            return subtitles  # subtitles not requested
+
+        def _msectotimecode(msec):
+            """ Helper utility to convert milliseconds to timecode """
+            components = []
+            for divider in [1000, 60, 60, 100]:
+                components.append(msec % divider)
+                msec //= divider
+            return "{3:02}:{2:02}:{1:02},{0:03}".format(*components)
+
+        def _fix_subtitle(subtitle):
+            for line in subtitle.splitlines():
+                m = re.match(r"^\s*([0-9]+);\s*([0-9]+)\s+([0-9]+)\s*$", line)
+                if m:
+                    yield m.group(1)
+                    start, stop = (_msectotimecode(int(t)) for t in m.groups()[1:])
+                    yield "{0} --> {1}".format(start, stop)
+                else:
+                    yield line
+
+        fixed_subtitles = {}
+        for k, v in subtitles.items():
+            fixed_subtitles[k] = "\r\n".join(_fix_subtitle(v))
+        return fixed_subtitles
youtube_dl/extractor/channel9.py
@@ -236,16 +236,17 @@ class Channel9IE(InfoExtractor):
         if contents is None:
             return contents
 
-        session_meta = {'session_code': self._extract_session_code(html),
-                        'session_day': self._extract_session_day(html),
-                        'session_room': self._extract_session_room(html),
-                        'session_speakers': self._extract_session_speakers(html),
-                        }
+        session_meta = {
+            'session_code': self._extract_session_code(html),
+            'session_day': self._extract_session_day(html),
+            'session_room': self._extract_session_room(html),
+            'session_speakers': self._extract_session_speakers(html),
+        }
 
         for content in contents:
             content.update(session_meta)
 
-        return contents
+        return self.playlist_result(contents)
 
     def _extract_list(self, content_path):
         rss = self._download_xml(self._RSS_URL % content_path, content_path, 'Downloading RSS')
youtube_dl/extractor/cinchcast.py (new file, 50 lines)
@@ -0,0 +1,50 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    unified_strdate,
+    xpath_text,
+)
+
+
+class CinchcastIE(InfoExtractor):
+    _VALID_URL = r'https?://player\.cinchcast\.com/.*?assetId=(?P<id>[0-9]+)'
+    _TEST = {
+        # Actual test is run in generic, look for undergroundwellness
+        'url': 'http://player.cinchcast.com/?platformId=1&assetType=single&assetId=7141703',
+        'only_matching': True,
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        doc = self._download_xml(
+            'http://www.blogtalkradio.com/playerasset/mrss?assetType=single&assetId=%s' % video_id,
+            video_id)
+
+        item = doc.find('.//item')
+        title = xpath_text(item, './title', fatal=True)
+        date_str = xpath_text(
+            item, './{http://developer.longtailvideo.com/trac/}date')
+        upload_date = unified_strdate(date_str, day_first=False)
+        # duration is present but wrong
+        formats = [{
+            'format_id': 'main',
+            'url': item.find('./{http://search.yahoo.com/mrss/}content').attrib['url'],
+        }]
+        backup_url = xpath_text(
+            item, './{http://developer.longtailvideo.com/trac/}backupContent')
+        if backup_url:
+            formats.append({
+                'preference': 2,  # seems to be more reliable
+                'format_id': 'backup',
+                'url': backup_url,
+            })
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'upload_date': upload_date,
+            'formats': formats,
+        }
youtube_dl/extractor/cliphunter.py
@@ -1,9 +1,7 @@
 from __future__ import unicode_literals
 
-import json
-import re
-
 from .common import InfoExtractor
+from ..utils import determine_ext
 
 
 _translation_table = {
@@ -27,10 +25,10 @@ class CliphunterIE(InfoExtractor):
     '''
     _TEST = {
         'url': 'http://www.cliphunter.com/w/1012420/Fun_Jynx_Maze_solo',
-        'md5': 'a2ba71eebf523859fe527a61018f723e',
+        'md5': 'b7c9bbd4eb3a226ab91093714dcaa480',
         'info_dict': {
             'id': '1012420',
-            'ext': 'mp4',
+            'ext': 'flv',
             'title': 'Fun Jynx Maze solo',
             'thumbnail': 're:^https?://.*\.jpg$',
             'age_limit': 18,
@@ -44,39 +42,31 @@ class CliphunterIE(InfoExtractor):
         video_title = self._search_regex(
             r'mediaTitle = "([^"]+)"', webpage, 'title')
 
-        pl_fiji = self._search_regex(
-            r'pl_fiji = \'([^\']+)\'', webpage, 'video data')
-        pl_c_qual = self._search_regex(
-            r'pl_c_qual = "(.)"', webpage, 'video quality')
-        video_url = _decode(pl_fiji)
-        formats = [{
-            'url': video_url,
-            'format_id': 'default-%s' % pl_c_qual,
-        }]
-
-        qualities_json = self._search_regex(
-            r'var pl_qualities\s*=\s*(.*?);\n', webpage, 'quality info')
-        qualities_data = json.loads(qualities_json)
-
-        for i, t in enumerate(
-                re.findall(r"pl_fiji_([a-z0-9]+)\s*=\s*'([^']+')", webpage)):
-            quality_id, crypted_url = t
-            video_url = _decode(crypted_url)
+        fmts = {}
+        for fmt in ('mp4', 'flv'):
+            fmt_list = self._parse_json(self._search_regex(
+                r'var %sjson\s*=\s*(\[.*?\]);' % fmt, webpage, '%s formats' % fmt), video_id)
+            for f in fmt_list:
+                fmts[f['fname']] = _decode(f['sUrl'])
+
+        qualities = self._parse_json(self._search_regex(
+            r'var player_btns\s*=\s*(.*?);\n', webpage, 'quality info'), video_id)
+
+        formats = []
+        for fname, url in fmts.items():
             f = {
-                'format_id': quality_id,
-                'url': video_url,
-                'quality': i,
+                'url': url,
             }
-            if quality_id in qualities_data:
-                qd = qualities_data[quality_id]
-                m = re.match(
-                    r'''(?x)<b>(?P<width>[0-9]+)x(?P<height>[0-9]+)<\\/b>
-                        \s*\(\s*(?P<tbr>[0-9]+)\s*kb\\/s''', qd)
-                if m:
-                    f['width'] = int(m.group('width'))
-                    f['height'] = int(m.group('height'))
-                    f['tbr'] = int(m.group('tbr'))
+            if fname in qualities:
+                qual = qualities[fname]
+                f.update({
+                    'format_id': '%s_%sp' % (determine_ext(url), qual['h']),
+                    'width': qual['w'],
+                    'height': qual['h'],
+                    'tbr': qual['br'],
+                })
             formats.append(f)
 
         self._sort_formats(formats)
 
         thumbnail = self._search_regex(
youtube_dl/extractor/cnet.py
@@ -2,12 +2,10 @@
 from __future__ import unicode_literals
 
 import json
-import re
 
 from .common import InfoExtractor
 from ..utils import (
     ExtractorError,
-    int_or_none,
 )
 
 
@@ -15,23 +13,24 @@ class CNETIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?cnet\.com/videos/(?P<id>[^/]+)/'
     _TEST = {
         'url': 'http://www.cnet.com/videos/hands-on-with-microsofts-windows-8-1-update/',
-        'md5': '041233212a0d06b179c87cbcca1577b8',
         'info_dict': {
             'id': '56f4ea68-bd21-4852-b08c-4de5b8354c60',
-            'ext': 'mp4',
+            'ext': 'flv',
             'title': 'Hands-on with Microsoft Windows 8.1 Update',
             'description': 'The new update to the Windows 8 OS brings improved performance for mouse and keyboard users.',
             'thumbnail': 're:^http://.*/flmswindows8.jpg$',
-            'uploader_id': 'sarah.mitroff@cbsinteractive.com',
+            'uploader_id': '6085384d-619e-11e3-b231-14feb5ca9861',
             'uploader': 'Sarah Mitroff',
+        },
+        'params': {
+            'skip_download': 'requires rtmpdump',
         }
     }
 
     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        display_id = mobj.group('id')
+        display_id = self._match_id(url)
 
         webpage = self._download_webpage(url, display_id)
 
         data_json = self._html_search_regex(
             r"<div class=\"cnetVideoPlayer\"\s+.*?data-cnet-video-options='([^']+)'",
             webpage, 'data json')
@@ -42,37 +41,31 @@ class CNETIE(InfoExtractor):
         if not vdata:
             raise ExtractorError('Cannot find video data')
 
+        mpx_account = data['config']['players']['default']['mpx_account']
+        vid = vdata['files']['rtmp']
+        tp_link = 'http://link.theplatform.com/s/%s/%s' % (mpx_account, vid)
+
         video_id = vdata['id']
         title = vdata.get('headline')
         if title is None:
             title = vdata.get('title')
         if title is None:
             raise ExtractorError('Cannot find title!')
-        description = vdata.get('dek')
         thumbnail = vdata.get('image', {}).get('path')
         author = vdata.get('author')
         if author:
             uploader = '%s %s' % (author['firstName'], author['lastName'])
-            uploader_id = author.get('email')
+            uploader_id = author.get('id')
         else:
             uploader = None
             uploader_id = None
 
-        formats = [{
-            'format_id': '%s-%s-%s' % (
-                f['type'], f['format'],
-                int_or_none(f.get('bitrate'), 1000, default='')),
-            'url': f['uri'],
-            'tbr': int_or_none(f.get('bitrate'), 1000),
-        } for f in vdata['files']['data']]
-        self._sort_formats(formats)
-
         return {
+            '_type': 'url_transparent',
+            'url': tp_link,
             'id': video_id,
             'display_id': display_id,
             'title': title,
-            'formats': formats,
-            'description': description,
             'uploader': uploader,
             'uploader_id': uploader_id,
             'thumbnail': thumbnail,
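Note: instead of assembling `formats` locally, the extractor now hands the ThePlatform link off via a `'url_transparent'` result, and the fields it still extracts (title, uploader, thumbnail) are overlaid on whatever ThePlatformIE resolves. For illustration, the returned dict now has this general shape (placeholder values):

```python
# A 'url_transparent' result delegates extraction but keeps local fields.
info = {
    '_type': 'url_transparent',
    'url': 'http://link.theplatform.com/s/<mpx_account>/<vid>',  # placeholders
    'id': '56f4ea68-bd21-4852-b08c-4de5b8354c60',
    'display_id': 'hands-on-with-microsofts-windows-8-1-update',
    'title': 'Hands-on with Microsoft Windows 8.1 Update',
    'uploader': 'Sarah Mitroff',
}
```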
youtube_dl/extractor/cnn.py
@@ -11,14 +11,14 @@ from ..utils import (
 
 
 class CNNIE(InfoExtractor):
-    _VALID_URL = r'''(?x)https?://((edition|www)\.)?cnn\.com/video/(data/.+?|\?)/
-        (?P<path>.+?/(?P<title>[^/]+?)(?:\.cnn(-ap)?|(?=&)))'''
+    _VALID_URL = r'''(?x)https?://(?:(?:edition|www)\.)?cnn\.com/video/(?:data/.+?|\?)/
+        (?P<path>.+?/(?P<title>[^/]+?)(?:\.(?:cnn|hln)(?:-ap)?|(?=&)))'''
 
     _TESTS = [{
         'url': 'http://edition.cnn.com/video/?/video/sports/2013/06/09/nadal-1-on-1.cnn',
         'md5': '3e6121ea48df7e2259fe73a0628605c4',
         'info_dict': {
-            'id': 'sports_2013_06_09_nadal-1-on-1.cnn',
+            'id': 'sports/2013/06/09/nadal-1-on-1.cnn',
             'ext': 'mp4',
             'title': 'Nadal wins 8th French Open title',
             'description': 'World Sport\'s Amanda Davies chats with 2013 French Open champion Rafael Nadal.',
@@ -35,13 +35,23 @@ class CNNIE(InfoExtractor):
             "description": "A Georgia Tech student welcomes the incoming freshmen with an epic speech backed by music from \"2001: A Space Odyssey.\"",
             "upload_date": "20130821",
         }
+    }, {
+        'url': 'http://www.cnn.com/video/data/2.0/video/living/2014/12/22/growing-america-nashville-salemtown-board-episode-1.hln.html',
+        'md5': 'f14d02ebd264df951feb2400e2c25a1b',
+        'info_dict': {
+            'id': 'living/2014/12/22/growing-america-nashville-salemtown-board-episode-1.hln',
+            'ext': 'mp4',
+            'title': 'Nashville Ep. 1: Hand crafted skateboards',
+            'description': 'md5:e7223a503315c9f150acac52e76de086',
+            'upload_date': '20141222',
+        }
     }]
 
     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url)
         path = mobj.group('path')
         page_title = mobj.group('title')
-        info_url = 'http://cnn.com/video/data/3.0/%s/index.xml' % path
+        info_url = 'http://edition.cnn.com/video/data/3.0/%s/index.xml' % path
         info = self._download_xml(info_url, page_title)
 
         formats = []
@@ -127,3 +137,28 @@ class CNNBlogsIE(InfoExtractor):
             'url': cnn_url,
             'ie_key': CNNIE.ie_key(),
         }
+
+
+class CNNArticleIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:(?:edition|www)\.)?cnn\.com/(?!video/)'
+    _TEST = {
+        'url': 'http://www.cnn.com/2014/12/21/politics/obama-north-koreas-hack-not-war-but-cyber-vandalism/',
+        'md5': '689034c2a3d9c6dc4aa72d65a81efd01',
+        'info_dict': {
+            'id': 'bestoftv/2014/12/21/ip-north-korea-obama.cnn',
+            'ext': 'mp4',
+            'title': 'Obama: Cyberattack not an act of war',
+            'description': 'md5:51ce6750450603795cad0cdfbd7d05c5',
+            'upload_date': '20141221',
+        },
+        'add_ie': ['CNN'],
+    }
+
+    def _real_extract(self, url):
+        webpage = self._download_webpage(url, url_basename(url))
+        cnn_url = self._html_search_regex(r"video:\s*'([^']+)'", webpage, 'cnn url')
+        return {
+            '_type': 'url',
+            'url': 'http://cnn.com/video/?/video/' + cnn_url,
+            'ie_key': CNNIE.ie_key(),
+        }
youtube_dl/extractor/collegerama.py (new file, 92 lines)
@@ -0,0 +1,92 @@
+from __future__ import unicode_literals
+
+import json
+
+from .common import InfoExtractor
+from ..compat import compat_urllib_request
+from ..utils import (
+    float_or_none,
+    int_or_none,
+)
+
+
+class CollegeRamaIE(InfoExtractor):
+    _VALID_URL = r'https?://collegerama\.tudelft\.nl/Mediasite/Play/(?P<id>[\da-f]+)'
+    _TESTS = [
+        {
+            'url': 'https://collegerama.tudelft.nl/Mediasite/Play/585a43626e544bdd97aeb71a0ec907a01d',
+            'md5': '481fda1c11f67588c0d9d8fbdced4e39',
+            'info_dict': {
+                'id': '585a43626e544bdd97aeb71a0ec907a01d',
+                'ext': 'mp4',
+                'title': 'Een nieuwe wereld: waarden, bewustzijn en techniek van de mensheid 2.0.',
+                'description': '',
+                'thumbnail': 're:^https?://.*\.jpg$',
+                'duration': 7713.088,
+                'timestamp': 1413309600,
+                'upload_date': '20141014',
+            },
+        },
+        {
+            'url': 'https://collegerama.tudelft.nl/Mediasite/Play/86a9ea9f53e149079fbdb4202b521ed21d?catalog=fd32fd35-6c99-466c-89d4-cd3c431bc8a4',
+            'md5': 'ef1fdded95bdf19b12c5999949419c92',
+            'info_dict': {
+                'id': '86a9ea9f53e149079fbdb4202b521ed21d',
+                'ext': 'wmv',
+                'title': '64ste Vakantiecursus: Afvalwater',
+                'description': 'md5:7fd774865cc69d972f542b157c328305',
+                'duration': 10853,
+                'timestamp': 1326446400,
+                'upload_date': '20120113',
+            },
+        },
+    ]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        player_options_request = {
+            "getPlayerOptionsRequest": {
+                "ResourceId": video_id,
+                "QueryString": "",
+            }
+        }
+
+        request = compat_urllib_request.Request(
+            'http://collegerama.tudelft.nl/Mediasite/PlayerService/PlayerService.svc/json/GetPlayerOptions',
+            json.dumps(player_options_request))
+        request.add_header('Content-Type', 'application/json')
+
+        player_options = self._download_json(request, video_id)
+
+        presentation = player_options['d']['Presentation']
+        title = presentation['Title']
+        description = presentation.get('Description')
+        thumbnail = None
+        duration = float_or_none(presentation.get('Duration'), 1000)
+        timestamp = int_or_none(presentation.get('UnixTime'), 1000)
+
+        formats = []
+        for stream in presentation['Streams']:
+            for video in stream['VideoUrls']:
+                thumbnail_url = stream.get('ThumbnailUrl')
+                if thumbnail_url:
+                    thumbnail = 'http://collegerama.tudelft.nl' + thumbnail_url
+                format_id = video['MediaType']
+                if format_id == 'SS':
+                    continue
+                formats.append({
+                    'url': video['Location'],
+                    'format_id': format_id,
+                })
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'duration': duration,
+            'timestamp': timestamp,
+            'formats': formats,
+        }
youtube_dl/extractor/comcarcoff.py (new file, 57 lines)
@@ -0,0 +1,57 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+import json
+
+from .common import InfoExtractor
+from ..utils import parse_iso8601
+
+
+class ComCarCoffIE(InfoExtractor):
+    _VALID_URL = r'http://(?:www\.)?comediansincarsgettingcoffee\.com/(?P<id>[a-z0-9\-]*)'
+    _TESTS = [{
+        'url': 'http://comediansincarsgettingcoffee.com/miranda-sings-happy-thanksgiving-miranda/',
+        'info_dict': {
+            'id': 'miranda-sings-happy-thanksgiving-miranda',
+            'ext': 'mp4',
+            'upload_date': '20141127',
+            'timestamp': 1417107600,
+            'title': 'Happy Thanksgiving Miranda',
+            'description': 'Jerry Seinfeld and his special guest Miranda Sings cruise around town in search of coffee, complaining and apologizing along the way.',
+            'thumbnail': 'http://ccc.crackle.com/images/s5e4_thumb.jpg',
+        },
+        'params': {
+            'skip_download': 'requires ffmpeg',
+        }
+    }]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        if not display_id:
+            display_id = 'comediansincarsgettingcoffee.com'
+        webpage = self._download_webpage(url, display_id)
+
+        full_data = json.loads(self._search_regex(
+            r'<script type="application/json" id="videoData">(?P<json>.+?)</script>',
+            webpage, 'full data json'))
+
+        video_id = full_data['activeVideo']['video']
+        video_data = full_data['videos'][video_id]
+        thumbnails = [{
+            'url': video_data['images']['thumb'],
+        }, {
+            'url': video_data['images']['poster'],
+        }]
+        formats = self._extract_m3u8_formats(
+            video_data['mediaUrl'], video_id, ext='mp4')
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'title': video_data['title'],
+            'description': video_data.get('description'),
+            'timestamp': parse_iso8601(video_data.get('pubDate')),
+            'thumbnails': thumbnails,
+            'formats': formats,
+            'webpage_url': 'http://comediansincarsgettingcoffee.com/%s' % (video_data.get('urlSlug', video_data.get('slug'))),
+        }
youtube_dl/extractor/comedycentral.py
@@ -3,9 +3,11 @@ from __future__ import unicode_literals
 import re
 
 from .mtv import MTVServicesInfoExtractor
-from ..utils import (
+from ..compat import (
     compat_str,
     compat_urllib_parse,
+)
+from ..utils import (
     ExtractorError,
     float_or_none,
     unified_strdate,
@@ -32,12 +34,12 @@ class ComedyCentralIE(MTVServicesInfoExtractor):
 
 class ComedyCentralShowsIE(MTVServicesInfoExtractor):
     IE_DESC = 'The Daily Show / The Colbert Report'
-    # urls can be abbreviations like :thedailyshow or :colbert
+    # urls can be abbreviations like :thedailyshow
     # urls for episodes like:
    # or urls for clips like: http://www.thedailyshow.com/watch/mon-december-10-2012/any-given-gun-day
    # or: http://www.colbertnation.com/the-colbert-report-videos/421667/november-29-2012/moon-shattering-news
    # or: http://www.colbertnation.com/the-colbert-report-collections/422008/festival-of-lights/79524
-    _VALID_URL = r'''(?x)^(:(?P<shortname>tds|thedailyshow|cr|colbert|colbertnation|colbertreport)
+    _VALID_URL = r'''(?x)^(:(?P<shortname>tds|thedailyshow)
         |https?://(:www\.)?
             (?P<showname>thedailyshow|thecolbertreport)\.(?:cc\.)?com/
             ((?:full-)?episodes/(?:[0-9a-z]{6}/)?(?P<episode>.*)|
@@ -47,8 +49,10 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
             |(watch/(?P<date>[^/]*)/(?P<tdstitle>.*))
         )|
         (?P<interview>
-            extended-interviews/(?P<interID>[0-9a-z]+)/(?:playlist_tds_extended_)?(?P<interview_title>.*?)(/.*?)?)))
-        (?:[?#].*|$)'''
+            extended-interviews/(?P<interID>[0-9a-z]+)/
+            (?:playlist_tds_extended_)?(?P<interview_title>[^/?#]*?)
+            (?:/[^/?#]?|[?#]|$))))
+        '''
     _TESTS = [{
         'url': 'http://thedailyshow.cc.com/watch/thu-december-13-2012/kristen-stewart',
         'md5': '4e2f5cb088a83cd8cdb7756132f9739d',
@@ -60,6 +64,38 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
             'uploader': 'thedailyshow',
             'title': 'thedailyshow kristen-stewart part 1',
         }
+    }, {
+        'url': 'http://thedailyshow.cc.com/extended-interviews/b6364d/sarah-chayes-extended-interview',
+        'info_dict': {
+            'id': 'sarah-chayes-extended-interview',
+            'description': 'Carnegie Endowment Senior Associate Sarah Chayes discusses how corrupt institutions function throughout the world in her book "Thieves of State: Why Corruption Threatens Global Security."',
+            'title': 'thedailyshow Sarah Chayes Extended Interview',
+        },
+        'playlist': [
+            {
+                'info_dict': {
+                    'id': '0baad492-cbec-4ec1-9e50-ad91c291127f',
+                    'ext': 'mp4',
+                    'upload_date': '20150129',
+                    'description': 'Carnegie Endowment Senior Associate Sarah Chayes discusses how corrupt institutions function throughout the world in her book "Thieves of State: Why Corruption Threatens Global Security."',
+                    'uploader': 'thedailyshow',
+                    'title': 'thedailyshow sarah-chayes-extended-interview part 1',
+                },
+            },
+            {
+                'info_dict': {
+                    'id': '1e4fb91b-8ce7-4277-bd7c-98c9f1bbd283',
+                    'ext': 'mp4',
+                    'upload_date': '20150129',
+                    'description': 'Carnegie Endowment Senior Associate Sarah Chayes discusses how corrupt institutions function throughout the world in her book "Thieves of State: Why Corruption Threatens Global Security."',
+                    'uploader': 'thedailyshow',
+                    'title': 'thedailyshow sarah-chayes-extended-interview part 2',
+                },
+            },
+        ],
+        'params': {
+            'skip_download': True,
+        },
     }, {
         'url': 'http://thedailyshow.cc.com/extended-interviews/xm3fnq/andrew-napolitano-extended-interview',
         'only_matching': True,
@@ -81,6 +117,9 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
     }, {
         'url': 'http://thedailyshow.cc.com/video-playlists/npde3s/the-daily-show-19088-highlights',
         'only_matching': True,
+    }, {
+        'url': 'http://thedailyshow.cc.com/video-playlists/t6d9sg/the-daily-show-20038-highlights/be3cwo',
+        'only_matching': True,
     }, {
         'url': 'http://thedailyshow.cc.com/special-editions/2l8fdb/special-edition---a-look-back-at-food',
         'only_matching': True,
@@ -234,6 +273,7 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
 
         return {
             '_type': 'playlist',
+            'id': epTitle,
             'entries': entries,
             'title': playlist_title,
             'description': description,
youtube_dl/extractor/common.py
@@ -15,6 +15,7 @@ import xml.etree.ElementTree
 
 from ..compat import (
     compat_cookiejar,
+    compat_HTTPError,
     compat_http_client,
     compat_urllib_error,
     compat_urllib_parse_urlparse,
@@ -22,6 +23,7 @@ from ..compat import (
     compat_str,
 )
 from ..utils import (
+    age_restricted,
     clean_html,
     compiled_regex_type,
     ExtractorError,
@@ -41,7 +43,7 @@ class InfoExtractor(object):
     information about the video (or videos) the URL refers to. This
     information includes the real video URL, the video title, author and
     others. The information is stored in a dictionary which is then
-    passed to the FileDownloader. The FileDownloader processes this
+    passed to the YoutubeDL. The YoutubeDL processes this
     information possibly downloading the video to the file system, among
     other possible outcomes.
 
@@ -87,12 +89,15 @@ class InfoExtractor(object):
                     * player_url   SWF Player URL (used for rtmpdump).
                     * protocol     The protocol that will be used for the actual
                                    download, lower-case.
-                                   "http", "https", "rtsp", "rtmp", "m3u8" or so.
+                                   "http", "https", "rtsp", "rtmp", "rtmpe",
+                                   "m3u8", or "m3u8_native".
                     * preference   Order number of this format. If this field is
                                    present and not None, the formats get sorted
                                    by this field, regardless of all other values.
                                    -1 for default (order by other properties),
                                    -2 or smaller for less than default.
+                                   < -1000 to hide the format (if there is
+                                   another one which is strictly better)
                     * language_preference  Is this in the correct requested
                                    language?
                                    10 if it's what the URL is about,
@@ -106,12 +111,17 @@ class InfoExtractor(object):
                                    (quality takes higher priority)
                                    -1 for default (order by other properties),
                                    -2 or smaller for less than default.
-                    * http_referer HTTP Referer header value to set.
                     * http_method  HTTP method to use for the download.
                     * http_headers A dictionary of additional HTTP headers
                                    to add to the request.
                     * http_post_data  Additional data to send with a POST
                                    request.
+                    * stretched_ratio  If given and not 1, indicates that the
+                                   video's pixels are not square.
+                                   width : height ratio as float.
+                    * no_resume    The server does not support resuming the
+                                   (HTTP or RTMP) download. Boolean.
 
     url:            Final video URL.
     ext:            Video filename extension.
     format:         The video format, defaults to ext (used for --get-format)
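Note: put together, a single entry in `formats` using the fields documented above might look like this (all values illustrative; only `url` is required per the surrounding docs):

```python
fmt = {
    'url': 'https://example.com/video/master.m3u8',
    'format_id': 'hls-1080p',
    'protocol': 'm3u8',         # one of the protocol values listed above
    'preference': -1,           # default: order by the other properties
    'language_preference': 10,  # the language the URL is about
    'quality': 5,
    'no_resume': True,
}
```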
@@ -119,19 +129,23 @@ class InfoExtractor(object):
 
     The following fields are optional:
 
+    alt_title:      A secondary title of the video.
     display_id      An alternative identifier for the video, not necessarily
                     unique, but available before title. Typically, id is
                     something like "4234987", title "Dancing naked mole rats",
                     and display_id "dancing-naked-mole-rats"
     thumbnails:     A list of dictionaries, with the following entries:
+                        * "id" (optional, string) - Thumbnail format ID
                         * "url"
+                        * "preference" (optional, int) - quality of the image
                         * "width" (optional, int)
                         * "height" (optional, int)
                         * "resolution" (optional, string "{width}x{height"},
                                         deprecated)
     thumbnail:      Full URL to a video thumbnail image.
-    description:    One-line video description.
+    description:    Full video description.
     uploader:       Full name of the video uploader.
+    creator:        The main artist who created the video.
     timestamp:      UNIX timestamp of the moment the video became available.
     upload_date:    Video upload date (YYYYMMDD).
                     If not explicitly set, calculated from timestamp.
@@ -143,7 +157,19 @@ class InfoExtractor(object):
     view_count:     How many users have watched the video on the platform.
     like_count:     Number of positive ratings of the video
     dislike_count:  Number of negative ratings of the video
+    average_rating: Average rating give by users, the scale used depends on the webpage
     comment_count:  Number of comments on the video
+    comments:       A list of comments, each with one or more of the following
+                    properties (all but one of text or html optional):
+                        * "author" - human-readable name of the comment author
+                        * "author_id" - user ID of the comment author
+                        * "id" - Comment ID
+                        * "html" - Comment as HTML
+                        * "text" - Plain text of the comment
+                        * "timestamp" - UNIX timestamp of comment
+                        * "parent" - ID of the comment this one is replying to.
+                                     Set to "root" to indicate that this is a
+                                     comment to the original video.
     parts:          A list of info_dicts for each of the parts of the video,
                     it must include the url field, if it's a rtmp download it
                     can contain additional fields for rtmpdump.
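Note: an illustrative `comments` value following the structure documented above (all values made up; `parent` is `'root'` for a top-level comment):

```python
comments = [{
    'author': 'Some Viewer',
    'author_id': 'viewer123',
    'id': 'c1',
    'text': 'Plain text of the first comment',
    'timestamp': 1422489600,
    'parent': 'root',
}, {
    'author': 'Another Viewer',
    'id': 'c2',
    'text': 'A reply to the first comment',
    'parent': 'c1',
}]
```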
@@ -162,8 +188,8 @@ class InfoExtractor(object):
 
 
     _type "playlist" indicates multiple videos.
-    There must be a key "entries", which is a list or a PagedList object, each
-    element of which is a valid dictionary under this specfication.
+    There must be a key "entries", which is a list, an iterable, or a PagedList
+    object, each element of which is a valid dictionary by this specification.
 
     Additionally, playlists can have "title" and "id" attributes with the same
     semantics as videos (see above).
@@ -178,9 +204,10 @@ class InfoExtractor(object):
     _type "url" indicates that the video must be extracted from another
     location, possibly by a different extractor. Its only required key is:
     "url" - the next URL to extract.
+    The key "ie_key" can be set to the class name (minus the trailing "IE",
+    e.g. "Youtube") if the extractor class is known in advance.
-    Additionally, it may have properties believed to be identical to the
-    resolved entity, for example "title" if the title of the referred video is
+    Additionally, the dictionary may have any properties of the resolved entity
+    known in advance, for example "title" if the title of the referred video is
     known ahead of time.
 
 
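Note: with the newly documented `ie_key`, a minimal `'url'` result that pins the target extractor looks like this (URL and title illustrative):

```python
entry = {
    '_type': 'url',
    'url': 'https://www.youtube.com/watch?v=BaW_jenozKc',
    'ie_key': 'Youtube',  # class name minus the trailing 'IE'
    'title': 'A title known ahead of time',
}
```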
@@ -241,8 +268,15 @@ class InfoExtractor(object):
 
     def extract(self, url):
         """Extracts URL information and returns it in list of dicts."""
-        self.initialize()
-        return self._real_extract(url)
+        try:
+            self.initialize()
+            return self._real_extract(url)
+        except ExtractorError:
+            raise
+        except compat_http_client.IncompleteRead as e:
+            raise ExtractorError('A network error has occured.', cause=e, expected=True)
+        except (KeyError, StopIteration) as e:
+            raise ExtractorError('An extractor error has occured.', cause=e)
 
     def set_downloader(self, downloader):
         """Sets the downloader for this IE."""
@@ -364,9 +398,19 @@ class InfoExtractor(object):
 
         return content
 
-    def _download_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True):
+    def _download_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True, tries=1, timeout=5):
         """ Returns the data of the page as a string """
-        res = self._download_webpage_handle(url_or_request, video_id, note, errnote, fatal)
+        success = False
+        try_count = 0
+        while success is False:
+            try:
+                res = self._download_webpage_handle(url_or_request, video_id, note, errnote, fatal)
+                success = True
+            except compat_http_client.IncompleteRead as e:
+                try_count += 1
+                if try_count >= tries:
+                    raise e
+                self._sleep(timeout, video_id)
         if res is False:
             return res
         else:
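Note: the retry loop added to `_download_webpage` only retries on `IncompleteRead`, sleeping between attempts and re-raising once the budget is spent. Restated as a generic, framework-free helper with the same defaults (`fetch` stands in for the actual download call):

```python
import time
try:
    from http.client import IncompleteRead  # Python 3
except ImportError:
    from httplib import IncompleteRead      # Python 2

def fetch_with_retries(fetch, tries=1, timeout=5):
    try_count = 0
    while True:
        try:
            return fetch()
        except IncompleteRead:
            try_count += 1
            if try_count >= tries:
                raise  # out of attempts: propagate the original error
            time.sleep(timeout)
```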
@@ -394,6 +438,10 @@ class InfoExtractor(object):
             url_or_request, video_id, note, errnote, fatal=fatal)
         if (not fatal) and json_string is False:
             return None
+        return self._parse_json(
+            json_string, video_id, transform_source=transform_source, fatal=fatal)
+
+    def _parse_json(self, json_string, video_id, transform_source=None, fatal=True):
         if transform_source:
             json_string = transform_source(json_string)
         try:
@@ -443,7 +491,7 @@ class InfoExtractor(object):
         return video_info
 
     @staticmethod
-    def playlist_result(entries, playlist_id=None, playlist_title=None):
+    def playlist_result(entries, playlist_id=None, playlist_title=None, playlist_description=None):
         """Returns a playlist"""
         video_info = {'_type': 'playlist',
                       'entries': entries}
@@ -451,6 +499,8 @@ class InfoExtractor(object):
             video_info['id'] = playlist_id
         if playlist_title:
             video_info['title'] = playlist_title
+        if playlist_description:
+            video_info['description'] = playlist_description
         return video_info
 
     def _search_regex(self, pattern, string, name, default=_NO_DEFAULT, fatal=True, flags=0, group=None):
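Note: the extended helper is small enough to restate in full; with the new argument a caller can attach a description to the playlist dict (mirrors the hunk above, usage values illustrative):

```python
def playlist_result(entries, playlist_id=None, playlist_title=None,
                    playlist_description=None):
    video_info = {'_type': 'playlist',
                  'entries': entries}
    if playlist_id:
        video_info['id'] = playlist_id
    if playlist_title:
        video_info['title'] = playlist_title
    if playlist_description:
        video_info['description'] = playlist_description
    return video_info

# e.g. playlist_result([], 'album-123', 'Some album', 'Tracks from the album page')
```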
@@ -468,7 +518,7 @@ class InfoExtractor(object):
             if mobj:
                 break
 
-        if os.name != 'nt' and sys.stderr.isatty():
+        if not self._downloader.params.get('no_color') and os.name != 'nt' and sys.stderr.isatty():
             _name = '\033[0;34m%s\033[0m' % name
         else:
             _name = name
@@ -585,9 +635,9 @@ class InfoExtractor(object):
         if display_name is None:
             display_name = name
         return self._html_search_regex(
-            r'''(?ix)<meta
+            r'''(?isx)<meta
                     (?=[^>]+(?:itemprop|name|property)=(["\']?)%s\1)
-                    [^>]+content=(["\'])(?P<content>.*?)\1''' % re.escape(name),
+                    [^>]+?content=(["\'])(?P<content>.*?)\2''' % re.escape(name),
             html, display_name, fatal=fatal, group='content', **kwargs)
 
     def _dc_search_uploader(self, html):
@@ -617,6 +667,21 @@ class InfoExtractor(object):
         }
         return RATING_TABLE.get(rating.lower(), None)
 
+    def _family_friendly_search(self, html):
+        # See http://schema.org/VideoObject
+        family_friendly = self._html_search_meta('isFamilyFriendly', html)
+
+        if not family_friendly:
+            return None
+
+        RATING_TABLE = {
+            '1': 0,
+            'true': 0,
+            '0': 18,
+            'false': 18,
+        }
+        return RATING_TABLE.get(family_friendly.lower(), None)
+
     def _twitter_search_player(self, html):
         return self._html_search_meta('twitter:player', html,
                                       'twitter card player')
@@ -666,21 +731,40 @@ class InfoExtractor(object):
             preference,
             f.get('language_preference') if f.get('language_preference') is not None else -1,
             f.get('quality') if f.get('quality') is not None else -1,
+            f.get('tbr') if f.get('tbr') is not None else -1,
+            f.get('filesize') if f.get('filesize') is not None else -1,
+            f.get('vbr') if f.get('vbr') is not None else -1,
             f.get('height') if f.get('height') is not None else -1,
             f.get('width') if f.get('width') is not None else -1,
             ext_preference,
-            f.get('tbr') if f.get('tbr') is not None else -1,
-            f.get('vbr') if f.get('vbr') is not None else -1,
             f.get('abr') if f.get('abr') is not None else -1,
             audio_ext_preference,
             f.get('fps') if f.get('fps') is not None else -1,
-            f.get('filesize') if f.get('filesize') is not None else -1,
             f.get('filesize_approx') if f.get('filesize_approx') is not None else -1,
             f.get('source_preference') if f.get('source_preference') is not None else -1,
             f.get('format_id'),
         )
         formats.sort(key=_formats_key)
 
+    def _check_formats(self, formats, video_id):
+        if formats:
+            formats[:] = filter(
+                lambda f: self._is_valid_url(
+                    f['url'], video_id,
+                    item='%s video format' % f.get('format_id') if f.get('format_id') else 'video'),
+                formats)
+
+    def _is_valid_url(self, url, video_id, item='video'):
+        try:
+            self._request_webpage(url, video_id, 'Checking %s URL' % item)
+            return True
+        except ExtractorError as e:
+            if isinstance(e.cause, compat_HTTPError):
+                self.report_warning(
+                    '%s URL is invalid, skipping' % item, video_id)
+                return False
+            raise
+
     def http_scheme(self):
         """ Either "http:" or "https:", depending on the user's preferences """
         return (
@ -728,33 +812,41 @@ class InfoExtractor(object):
|
|||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
return formats
|
return formats
|
||||||
|
|
||||||
def _extract_f4m_formats(self, manifest_url, video_id):
|
def _extract_f4m_formats(self, manifest_url, video_id, preference=None, f4m_id=None):
|
||||||
manifest = self._download_xml(
|
manifest = self._download_xml(
|
||||||
manifest_url, video_id, 'Downloading f4m manifest',
|
manifest_url, video_id, 'Downloading f4m manifest',
|
||||||
'Unable to download f4m manifest')
|
'Unable to download f4m manifest')
|
||||||
|
|
||||||
formats = []
|
formats = []
|
||||||
|
manifest_version = '1.0'
|
||||||
media_nodes = manifest.findall('{http://ns.adobe.com/f4m/1.0}media')
|
media_nodes = manifest.findall('{http://ns.adobe.com/f4m/1.0}media')
|
||||||
|
if not media_nodes:
|
||||||
|
manifest_version = '2.0'
|
||||||
|
media_nodes = manifest.findall('{http://ns.adobe.com/f4m/2.0}media')
|
||||||
for i, media_el in enumerate(media_nodes):
|
for i, media_el in enumerate(media_nodes):
|
||||||
|
if manifest_version == '2.0':
|
||||||
|
manifest_url = ('/'.join(manifest_url.split('/')[:-1]) + '/'
|
||||||
|
+ (media_el.attrib.get('href') or media_el.attrib.get('url')))
|
||||||
tbr = int_or_none(media_el.attrib.get('bitrate'))
|
tbr = int_or_none(media_el.attrib.get('bitrate'))
|
||||||
format_id = 'f4m-%d' % (i if tbr is None else tbr)
|
|
||||||
formats.append({
|
formats.append({
|
||||||
'format_id': format_id,
|
'format_id': '-'.join(filter(None, [f4m_id, 'f4m-%d' % (i if tbr is None else tbr)])),
|
||||||
'url': manifest_url,
|
'url': manifest_url,
|
||||||
'ext': 'flv',
|
'ext': 'flv',
|
||||||
'tbr': tbr,
|
'tbr': tbr,
|
||||||
'width': int_or_none(media_el.attrib.get('width')),
|
'width': int_or_none(media_el.attrib.get('width')),
|
||||||
'height': int_or_none(media_el.attrib.get('height')),
|
'height': int_or_none(media_el.attrib.get('height')),
|
||||||
|
'preference': preference,
|
||||||
})
|
})
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
|
|
||||||
return formats
|
return formats
|
||||||
|
|
||||||
def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
|
def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
|
||||||
entry_protocol='m3u8', preference=None):
|
entry_protocol='m3u8', preference=None,
|
||||||
|
m3u8_id=None):
|
||||||
|
|
||||||
formats = [{
|
formats = [{
|
||||||
'format_id': 'm3u8-meta',
|
'format_id': '-'.join(filter(None, [m3u8_id, 'm3u8-meta'])),
|
||||||
'url': m3u8_url,
|
'url': m3u8_url,
|
||||||
'ext': ext,
|
'ext': ext,
|
||||||
'protocol': 'm3u8',
|
'protocol': 'm3u8',
|
||||||
@ -773,6 +865,7 @@ class InfoExtractor(object):
|
|||||||
note='Downloading m3u8 information',
|
note='Downloading m3u8 information',
|
||||||
errnote='Failed to download m3u8 information')
|
errnote='Failed to download m3u8 information')
|
||||||
last_info = None
|
last_info = None
|
||||||
|
last_media = None
|
||||||
kv_rex = re.compile(
|
kv_rex = re.compile(
|
||||||
r'(?P<key>[a-zA-Z_-]+)=(?P<val>"[^"]+"|[^",]+)(?:,|$)')
|
r'(?P<key>[a-zA-Z_-]+)=(?P<val>"[^"]+"|[^",]+)(?:,|$)')
|
||||||
for line in m3u8_doc.splitlines():
|
for line in m3u8_doc.splitlines():
|
||||||
@ -783,6 +876,13 @@ class InfoExtractor(object):
|
|||||||
if v.startswith('"'):
|
if v.startswith('"'):
|
||||||
v = v[1:-1]
|
v = v[1:-1]
|
||||||
last_info[m.group('key')] = v
|
last_info[m.group('key')] = v
|
||||||
|
elif line.startswith('#EXT-X-MEDIA:'):
|
||||||
|
last_media = {}
|
||||||
|
for m in kv_rex.finditer(line):
|
||||||
|
v = m.group('val')
|
||||||
|
if v.startswith('"'):
|
||||||
|
v = v[1:-1]
|
||||||
|
last_media[m.group('key')] = v
|
||||||
elif line.startswith('#') or not line.strip():
|
elif line.startswith('#') or not line.strip():
|
||||||
continue
|
continue
|
||||||
else:
|
else:
|
||||||
@ -790,9 +890,8 @@ class InfoExtractor(object):
|
|||||||
formats.append({'url': format_url(line)})
|
formats.append({'url': format_url(line)})
|
||||||
continue
|
continue
|
||||||
tbr = int_or_none(last_info.get('BANDWIDTH'), scale=1000)
|
tbr = int_or_none(last_info.get('BANDWIDTH'), scale=1000)
|
||||||
|
|
||||||
f = {
|
f = {
|
||||||
'format_id': 'm3u8-%d' % (tbr if tbr else len(formats)),
|
'format_id': '-'.join(filter(None, [m3u8_id, 'm3u8-%d' % (tbr if tbr else len(formats))])),
|
||||||
'url': format_url(line.strip()),
|
'url': format_url(line.strip()),
|
||||||
'tbr': tbr,
|
'tbr': tbr,
|
||||||
'ext': ext,
|
'ext': ext,
|
||||||
@ -812,11 +911,60 @@ class InfoExtractor(object):
|
|||||||
width_str, height_str = resolution.split('x')
|
width_str, height_str = resolution.split('x')
|
||||||
f['width'] = int(width_str)
|
f['width'] = int(width_str)
|
||||||
f['height'] = int(height_str)
|
f['height'] = int(height_str)
|
||||||
|
if last_media is not None:
|
||||||
|
f['m3u8_media'] = last_media
|
||||||
|
last_media = None
|
||||||
formats.append(f)
|
formats.append(f)
|
||||||
last_info = {}
|
last_info = {}
|
||||||
self._sort_formats(formats)
|
self._sort_formats(formats)
|
||||||
return formats
|
return formats
|
||||||
|
|
||||||
|
# TODO: improve extraction
|
||||||
|
def _extract_smil_formats(self, smil_url, video_id, fatal=True):
|
||||||
|
smil = self._download_xml(
|
||||||
|
smil_url, video_id, 'Downloading SMIL file',
|
||||||
|
'Unable to download SMIL file', fatal=fatal)
|
||||||
|
if smil is False:
|
||||||
|
assert not fatal
|
||||||
|
return []
|
||||||
|
|
||||||
|
base = smil.find('./head/meta').get('base')
|
||||||
|
|
||||||
|
formats = []
|
||||||
|
rtmp_count = 0
|
||||||
|
for video in smil.findall('./body/switch/video'):
|
||||||
|
src = video.get('src')
|
||||||
|
if not src:
|
||||||
|
continue
|
||||||
|
bitrate = int_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
|
||||||
|
width = int_or_none(video.get('width'))
|
||||||
|
height = int_or_none(video.get('height'))
|
||||||
|
proto = video.get('proto')
|
||||||
|
if not proto:
|
||||||
|
if base:
|
||||||
|
if base.startswith('rtmp'):
|
||||||
|
proto = 'rtmp'
|
||||||
|
elif base.startswith('http'):
|
||||||
|
proto = 'http'
|
||||||
|
ext = video.get('ext')
|
||||||
|
if proto == 'm3u8':
|
||||||
|
formats.extend(self._extract_m3u8_formats(src, video_id, ext))
|
||||||
|
elif proto == 'rtmp':
|
||||||
|
rtmp_count += 1
|
||||||
|
streamer = video.get('streamer') or base
|
||||||
|
formats.append({
|
||||||
|
'url': streamer,
|
||||||
|
'play_path': src,
|
||||||
|
'ext': 'flv',
|
||||||
|
'format_id': 'rtmp-%d' % (rtmp_count if bitrate is None else bitrate),
|
||||||
|
'tbr': bitrate,
|
||||||
|
'width': width,
|
||||||
|
'height': height,
|
||||||
|
})
|
||||||
|
self._sort_formats(formats)
|
||||||
|
|
||||||
|
return formats
|
||||||
|
|
||||||
def _live_title(self, name):
|
def _live_title(self, name):
|
||||||
""" Generate the title for a live video """
|
""" Generate the title for a live video """
|
||||||
now = datetime.datetime.now()
|
now = datetime.datetime.now()
|
||||||
@ -846,10 +994,40 @@ class InfoExtractor(object):
|
|||||||
return res
|
return res
|
||||||
|
|
||||||
def _set_cookie(self, domain, name, value, expire_time=None):
|
def _set_cookie(self, domain, name, value, expire_time=None):
|
||||||
cookie = compat_cookiejar.Cookie(0, name, value, None, None, domain, None,
|
cookie = compat_cookiejar.Cookie(
|
||||||
|
0, name, value, None, None, domain, None,
|
||||||
None, '/', True, False, expire_time, '', None, None, None)
|
None, '/', True, False, expire_time, '', None, None, None)
|
||||||
self._downloader.cookiejar.set_cookie(cookie)
|
self._downloader.cookiejar.set_cookie(cookie)
|
||||||
|
|
||||||
|
def get_testcases(self, include_onlymatching=False):
|
||||||
|
t = getattr(self, '_TEST', None)
|
||||||
|
if t:
|
||||||
|
assert not hasattr(self, '_TESTS'), \
|
||||||
|
'%s has _TEST and _TESTS' % type(self).__name__
|
||||||
|
tests = [t]
|
||||||
|
else:
|
||||||
|
tests = getattr(self, '_TESTS', [])
|
||||||
|
for t in tests:
|
||||||
|
if not include_onlymatching and t.get('only_matching', False):
|
||||||
|
continue
|
||||||
|
t['name'] = type(self).__name__[:-len('IE')]
|
||||||
|
yield t
|
||||||
|
|
||||||
|
def is_suitable(self, age_limit):
|
||||||
|
""" Test whether the extractor is generally suitable for the given
|
||||||
|
age limit (i.e. pornographic sites are not, all others usually are) """
|
||||||
|
|
||||||
|
any_restricted = False
|
||||||
|
for tc in self.get_testcases(include_onlymatching=False):
|
||||||
|
if 'playlist' in tc:
|
||||||
|
tc = tc['playlist'][0]
|
||||||
|
is_restricted = age_restricted(
|
||||||
|
tc.get('info_dict', {}).get('age_limit'), age_limit)
|
||||||
|
if not is_restricted:
|
||||||
|
return True
|
||||||
|
any_restricted = any_restricted or is_restricted
|
||||||
|
return not any_restricted
|
||||||
|
|
||||||
|
|
||||||
class SearchInfoExtractor(InfoExtractor):
|
class SearchInfoExtractor(InfoExtractor):
|
||||||
"""
|
"""
|
||||||
|
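Both _extract_f4m_formats and _extract_m3u8_formats above now build their format_id the same way: an optional caller-supplied prefix (f4m_id / m3u8_id) is joined to the protocol-specific id, and filter(None, ...) silently drops a missing prefix. A minimal standalone sketch of that behavior (build_format_id is an illustrative name, not a helper in the patch):

    # Sketch of the format_id construction used in the hunks above.
    # build_format_id is our illustrative name, not part of youtube-dl.
    def build_format_id(prefix, proto_id):
        # filter(None, ...) drops a None/empty prefix, so no stray '-'.
        return '-'.join(filter(None, [prefix, proto_id]))

    print(build_format_id('hds', 'f4m-1500'))  # hds-f4m-1500
    print(build_format_id(None, 'm3u8-0'))     # m3u8-0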
46  youtube_dl/extractor/commonmistakes.py  Normal file
@@ -0,0 +1,46 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import ExtractorError
+
+
+class CommonMistakesIE(InfoExtractor):
+    IE_DESC = False  # Do not list
+    _VALID_URL = r'''(?x)
+        (?:url|URL)
+    '''
+
+    _TESTS = [{
+        'url': 'url',
+        'only_matching': True,
+    }, {
+        'url': 'URL',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        msg = (
+            'You\'ve asked youtube-dl to download the URL "%s". '
+            'That doesn\'t make any sense. '
+            'Simply remove the parameter in your command or configuration.'
+        ) % url
+        if not self._downloader.params.get('verbose'):
+            msg += ' Add -v to the command line to see what arguments and configuration youtube-dl got.'
+        raise ExtractorError(msg, expected=True)
+
+
+class UnicodeBOMIE(InfoExtractor):
+    IE_DESC = False
+    _VALID_URL = r'(?P<bom>\ufeff)(?P<id>.*)$'
+
+    _TESTS = [{
+        'url': '\ufeffhttp://www.youtube.com/watch?v=BaW_jenozKc',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        real_url = self._match_id(url)
+        self.report_warning(
+            'Your URL starts with a Byte Order Mark (BOM). '
+            'Removing the BOM and looking for "%s" ...' % real_url)
+        return self.url_result(real_url)
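The UnicodeBOMIE pattern above matches only URLs that begin with U+FEFF and captures everything after it as the real URL; anything without a BOM falls through to the other extractors. A quick standalone check of that regex (Python 3, where the re module understands \uXXXX escapes in patterns):

    import re

    # Same pattern as UnicodeBOMIE._VALID_URL.
    BOM_RE = re.compile(r'(?P<bom>\ufeff)(?P<id>.*)$')

    m = BOM_RE.match('\ufeffhttp://www.youtube.com/watch?v=BaW_jenozKc')
    print(m.group('id'))                       # the URL without the BOM
    print(BOM_RE.match('http://example.com'))  # None: no BOM, no match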
youtube_dl/extractor/condenast.py
@@ -5,12 +5,14 @@ import re
 import json

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
     compat_urllib_parse,
-    orderedSet,
     compat_urllib_parse_urlparse,
     compat_urlparse,
 )
+from ..utils import (
+    orderedSet,
+)


 class CondeNastIE(InfoExtractor):
youtube_dl/extractor/crunchyroll.py
@@ -10,10 +10,12 @@ import xml.etree.ElementTree
 from hashlib import sha1
 from math import pow, sqrt, floor
 from .subtitles import SubtitlesInfoExtractor
-from ..utils import (
-    ExtractorError,
+from ..compat import (
     compat_urllib_parse,
     compat_urllib_request,
+)
+from ..utils import (
+    ExtractorError,
     bytes_to_intlist,
     intlist_to_bytes,
     unified_strdate,
@@ -27,10 +29,9 @@ from .common import InfoExtractor


 class CrunchyrollIE(SubtitlesInfoExtractor):
-    _VALID_URL = r'https?://(?:(?P<prefix>www|m)\.)?(?P<url>crunchyroll\.com/(?:[^/]*/[^/?&]*?|media/\?id=)(?P<video_id>[0-9]+))(?:[/?&]|$)'
-    _TEST = {
+    _VALID_URL = r'https?://(?:(?P<prefix>www|m)\.)?(?P<url>crunchyroll\.(?:com|fr)/(?:[^/]*/[^/?&]*?|media/\?id=)(?P<video_id>[0-9]+))(?:[/?&]|$)'
+    _TESTS = [{
         'url': 'http://www.crunchyroll.com/wanna-be-the-strongest-in-the-world/episode-1-an-idol-wrestler-is-born-645513',
-        #'md5': 'b1639fd6ddfaa43788c85f6d1dddd412',
         'info_dict': {
             'id': '645513',
             'ext': 'flv',
@@ -45,7 +46,10 @@ class CrunchyrollIE(SubtitlesInfoExtractor):
             # rtmp
             'skip_download': True,
         },
-    }
+    }, {
+        'url': 'http://www.crunchyroll.fr/girl-friend-beta/episode-11-goodbye-la-mode-661697',
+        'only_matching': True,
+    }]

     _FORMAT_IDS = {
         '360': ('60', '106'),
@@ -224,7 +228,7 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
         video_thumbnail = self._search_regex(r'<episode_image_url>([^<]+)', playerdata, 'thumbnail', fatal=False)

         formats = []
-        for fmt in re.findall(r'\?p([0-9]{3,4})=1', webpage):
+        for fmt in re.findall(r'showmedia\.([0-9]{3,4})p', webpage):
             stream_quality, stream_format = self._FORMAT_IDS[fmt]
             video_format = fmt + 'p'
             streamdata_req = compat_urllib_request.Request('http://www.crunchyroll.com/xml/')
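The Crunchyroll quality-discovery regex changes from the old ?p480=1-style query flags to showmedia.480p tokens in the page source. A small sketch against invented page markup shows what the new pattern yields:

    import re

    # Invented snippet standing in for the Crunchyroll page source.
    sample = 'href="#" data-x="showmedia.480p" ... data-y="showmedia.720p"'
    print(re.findall(r'showmedia\.([0-9]{3,4})p', sample))  # ['480', '720']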
youtube_dl/extractor/cspan.py
@@ -27,7 +27,6 @@ class CSpanIE(InfoExtractor):
         'url': 'http://www.c-span.org/video/?c4486943/cspan-international-health-care-models',
         # For whatever reason, the served video alternates between
         # two different ones
-        #'md5': 'dbb0f047376d457f2ab8b3929cbb2d0c',
         'info_dict': {
             'id': '340723',
             'ext': 'mp4',
93  youtube_dl/extractor/ctsnews.py  Normal file
@@ -0,0 +1,93 @@
+# -*- coding: utf-8 -*-
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import parse_iso8601, ExtractorError
+
+
+class CtsNewsIE(InfoExtractor):
+    # https connection failed (Connection reset)
+    _VALID_URL = r'http://news\.cts\.com\.tw/[a-z]+/[a-z]+/\d+/(?P<id>\d+)\.html'
+    _TESTS = [{
+        'url': 'http://news.cts.com.tw/cts/international/201501/201501291578109.html',
+        'md5': 'a9875cb790252b08431186d741beaabe',
+        'info_dict': {
+            'id': '201501291578109',
+            'ext': 'mp4',
+            'title': '以色列.真主黨交火 3人死亡',
+            'description': 'md5:95e9b295c898b7ff294f09d450178d7d',
+            'timestamp': 1422528540,
+            'upload_date': '20150129',
+        }
+    }, {
+        # News count not appear on page but still available in database
+        'url': 'http://news.cts.com.tw/cts/international/201309/201309031304098.html',
+        'md5': '3aee7e0df7cdff94e43581f54c22619e',
+        'info_dict': {
+            'id': '201309031304098',
+            'ext': 'mp4',
+            'title': '韓國31歲童顏男 貌如十多歲小孩',
+            'description': 'md5:f183feeba3752b683827aab71adad584',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'timestamp': 1378205880,
+            'upload_date': '20130903',
+        }
+    }, {
+        # With Youtube embedded video
+        'url': 'http://news.cts.com.tw/cts/money/201501/201501291578003.html',
+        'md5': '1d842c771dc94c8c3bca5af2cc1db9c5',
+        'add_ie': ['Youtube'],
+        'info_dict': {
+            'id': 'OVbfO7d0_hQ',
+            'ext': 'mp4',
+            'title': 'iPhone6熱銷 蘋果財報亮眼',
+            'description': 'md5:f395d4f485487bb0f992ed2c4b07aa7d',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'upload_date': '20150128',
+            'uploader_id': 'TBSCTS',
+            'uploader': '中華電視公司',
+        }
+    }]
+
+    def _real_extract(self, url):
+        news_id = self._match_id(url)
+        page = self._download_webpage(url, news_id)
+
+        if self._search_regex(r'(CTSPlayer2)', page, 'CTSPlayer2 identifier', default=None):
+            feed_url = self._html_search_regex(
+                r'(http://news\.cts\.com\.tw/action/mp4feed\.php\?news_id=\d+)',
+                page, 'feed url')
+            video_url = self._download_webpage(
+                feed_url, news_id, note='Fetching feed')
+        else:
+            self.to_screen('Not CTSPlayer video, trying Youtube...')
+            youtube_url = self._search_regex(
+                r'src="(//www\.youtube\.com/embed/[^"]+)"', page, 'youtube url',
+                default=None)
+            if not youtube_url:
+                raise ExtractorError('The news includes no videos!', expected=True)
+
+            return {
+                '_type': 'url',
+                'url': youtube_url,
+                'ie_key': 'Youtube',
+            }
+
+        description = self._html_search_meta('description', page)
+        title = self._html_search_meta('title', page)
+        thumbnail = self._html_search_meta('image', page)
+
+        datetime_str = self._html_search_regex(
+            r'(\d{4}/\d{2}/\d{2} \d{2}:\d{2})', page, 'date and time')
+        # Transform into ISO 8601 format with timezone info
+        datetime_str = datetime_str.replace('/', '-') + ':00+0800'
+        timestamp = parse_iso8601(datetime_str, delimiter=' ')
+
+        return {
+            'id': news_id,
+            'url': video_url,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
+        }
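The timestamp handling in CtsNewsIE is just a string rewrite: the on-page YYYY/MM/DD HH:MM value is turned into ISO 8601 with an explicit +0800 (Taipei) offset before parse_iso8601 converts it to epoch seconds. A standalone sketch, assuming youtube_dl is importable and using an invented sample date:

    from youtube_dl.utils import parse_iso8601

    datetime_str = '2015/01/29 18:09'  # invented example value
    datetime_str = datetime_str.replace('/', '-') + ':00+0800'
    print(datetime_str)                                # 2015-01-29 18:09:00+0800
    print(parse_iso8601(datetime_str, delimiter=' '))  # 1422526140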
youtube_dl/extractor/dailymotion.py
@@ -8,13 +8,15 @@ import itertools
 from .common import InfoExtractor
 from .subtitles import SubtitlesInfoExtractor

-from ..utils import (
-    compat_urllib_request,
+from ..compat import (
     compat_str,
+    compat_urllib_request,
+)
+from ..utils import (
+    ExtractorError,
+    int_or_none,
     orderedSet,
     str_to_int,
-    int_or_none,
-    ExtractorError,
     unescapeHTML,
 )

@@ -192,6 +194,7 @@ class DailymotionPlaylistIE(DailymotionBaseInfoExtractor):
         'url': 'http://www.dailymotion.com/playlist/xv4bw_nqtv_sport/1#video=xl8v3q',
         'info_dict': {
             'title': 'SPORT',
+            'id': 'xv4bw_nqtv_sport',
         },
         'playlist_mincount': 20,
     }]
youtube_dl/extractor/daum.py
@@ -5,7 +5,7 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
+from ..compat import (
     compat_urllib_parse,
 )

@@ -38,7 +38,7 @@ class DaumIE(InfoExtractor):
         canonical_url = 'http://tvpot.daum.net/v/%s' % video_id
         webpage = self._download_webpage(canonical_url, video_id)
         full_id = self._search_regex(
-            r'<iframe src="http://videofarm.daum.net/controller/video/viewer/Video.html\?.*?vid=(.+?)[&"]',
+            r'src=["\']http://videofarm\.daum\.net/controller/video/viewer/Video\.html\?.*?vid=(.+?)[&"\']',
             webpage, 'full id')
         query = compat_urllib_parse.urlencode({'vid': full_id})
         info = self._download_xml(
youtube_dl/extractor/dbtv.py
@@ -4,6 +4,7 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
     float_or_none,
     int_or_none,
@@ -61,7 +62,7 @@ class DBTVIE(InfoExtractor):
         self._sort_formats(formats)

         return {
-            'id': video['id'],
+            'id': compat_str(video['id']),
             'display_id': display_id,
             'title': video['title'],
             'description': clean_html(video['desc']),
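The one-character-looking DBTV change matters because info-dict ids are expected to be strings, while the API returns a number. compat_str (unicode on Python 2, str on Python 3) normalizes that; a trivial check, assuming youtube_dl is importable:

    from youtube_dl.compat import compat_str

    # An id the way an API may return it: a number, not a string.
    print(compat_str(33100) == '33100')  # True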
61  youtube_dl/extractor/dctp.py  Normal file
@@ -0,0 +1,61 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import compat_str
+
+
+class DctpTvIE(InfoExtractor):
+    _VALID_URL = r'http://www.dctp.tv/(#/)?filme/(?P<id>.+?)/$'
+    _TEST = {
+        'url': 'http://www.dctp.tv/filme/videoinstallation-fuer-eine-kaufhausfassade/',
+        'info_dict': {
+            'id': '1324',
+            'display_id': 'videoinstallation-fuer-eine-kaufhausfassade',
+            'ext': 'flv',
+            'title': 'Videoinstallation für eine Kaufhausfassade'
+        },
+        'params': {
+            # rtmp download
+            'skip_download': True,
+        }
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        base_url = 'http://dctp-ivms2-restapi.s3.amazonaws.com/'
+        version_json = self._download_json(
+            base_url + 'version.json',
+            video_id, note='Determining file version')
+        version = version_json['version_name']
+        info_json = self._download_json(
+            '{0}{1}/restapi/slugs/{2}.json'.format(base_url, version, video_id),
+            video_id, note='Fetching object ID')
+        object_id = compat_str(info_json['object_id'])
+        meta_json = self._download_json(
+            '{0}{1}/restapi/media/{2}.json'.format(base_url, version, object_id),
+            video_id, note='Downloading metadata')
+        uuid = meta_json['uuid']
+        title = meta_json['title']
+        wide = meta_json['is_wide']
+        if wide:
+            ratio = '16x9'
+        else:
+            ratio = '4x3'
+        play_path = 'mp4:{0}_dctp_0500_{1}.m4v'.format(uuid, ratio)
+
+        servers_json = self._download_json(
+            'http://www.dctp.tv/streaming_servers/',
+            video_id, note='Downloading server list')
+        url = servers_json[0]['endpoint']
+
+        return {
+            'id': object_id,
+            'title': title,
+            'format': 'rtmp',
+            'url': url,
+            'play_path': play_path,
+            'rtmp_real_time': True,
+            'ext': 'flv',
+            'display_id': video_id
+        }
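DctpTvIE assembles its RTMP play path purely from metadata fields: the uuid and the aspect-ratio flag select the file name. A sketch of that construction with made-up values (the uuid below is invented):

    # meta_json['uuid'] and meta_json['is_wide'] drive the play path.
    uuid = 'aed3b8b2-1889-4df5-ae63-ad85f5572f27'  # invented example
    is_wide = True
    ratio = '16x9' if is_wide else '4x3'
    play_path = 'mp4:{0}_dctp_0500_{1}.m4v'.format(uuid, ratio)
    print(play_path)
    # mp4:aed3b8b2-1889-4df5-ae63-ad85f5572f27_dctp_0500_16x9.m4v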
youtube_dl/extractor/defense.py
@@ -1,40 +1,38 @@
 from __future__ import unicode_literals

-import re
-import json
-
 from .common import InfoExtractor


 class DefenseGouvFrIE(InfoExtractor):
     IE_NAME = 'defense.gouv.fr'
-    _VALID_URL = (r'http://.*?\.defense\.gouv\.fr/layout/set/'
-                  r'ligthboxvideo/base-de-medias/webtv/(.*)')
+    _VALID_URL = r'http://.*?\.defense\.gouv\.fr/layout/set/ligthboxvideo/base-de-medias/webtv/(?P<id>[^/?#]*)'

     _TEST = {
         'url': 'http://www.defense.gouv.fr/layout/set/ligthboxvideo/base-de-medias/webtv/attaque-chimique-syrienne-du-21-aout-2013-1',
-        'file': '11213.mp4',
         'md5': '75bba6124da7e63d2d60b5244ec9430c',
-        "info_dict": {
-            "title": "attaque-chimique-syrienne-du-21-aout-2013-1"
+        'info_dict': {
+            'id': '11213',
+            'ext': 'mp4',
+            'title': 'attaque-chimique-syrienne-du-21-aout-2013-1'
         }
     }

     def _real_extract(self, url):
-        title = re.match(self._VALID_URL, url).group(1)
+        title = self._match_id(url)
         webpage = self._download_webpage(url, title)

         video_id = self._search_regex(
             r"flashvars.pvg_id=\"(\d+)\";",
             webpage, 'ID')

         json_url = ('http://static.videos.gouv.fr/brightcovehub/export/json/'
                     + video_id)
-        info = self._download_webpage(json_url, title,
-                                      'Downloading JSON config')
-        video_url = json.loads(info)['renditions'][0]['url']
+        info = self._download_json(json_url, title, 'Downloading JSON config')
+        video_url = info['renditions'][0]['url']

-        return {'id': video_id,
-                'ext': 'mp4',
-                'url': video_url,
-                'title': title,
-                }
+        return {
+            'id': video_id,
+            'ext': 'mp4',
+            'url': video_url,
+            'title': title,
+        }
youtube_dl/extractor/discovery.py
@@ -1,47 +1,45 @@
 from __future__ import unicode_literals

-import re
-import json
-
 from .common import InfoExtractor
+from ..utils import (
+    parse_iso8601,
+    int_or_none,
+)


 class DiscoveryIE(InfoExtractor):
-    _VALID_URL = r'http://www\.discovery\.com\/[a-zA-Z0-9\-]*/[a-zA-Z0-9\-]*/videos/(?P<id>[a-zA-Z0-9\-]*)(.htm)?'
+    _VALID_URL = r'http://www\.discovery\.com\/[a-zA-Z0-9\-]*/[a-zA-Z0-9\-]*/videos/(?P<id>[a-zA-Z0-9_\-]*)(?:\.htm)?'
     _TEST = {
         'url': 'http://www.discovery.com/tv-shows/mythbusters/videos/mission-impossible-outtakes.htm',
-        'md5': 'e12614f9ee303a6ccef415cb0793eba2',
+        'md5': '3c69d77d9b0d82bfd5e5932a60f26504',
         'info_dict': {
-            'id': '614784',
-            'ext': 'mp4',
-            'title': 'MythBusters: Mission Impossible Outtakes',
+            'id': 'mission-impossible-outtakes',
+            'ext': 'flv',
+            'title': 'Mission Impossible Outtakes',
             'description': ('Watch Jamie Hyneman and Adam Savage practice being'
                             ' each other -- to the point of confusing Jamie\'s dog -- and '
                             'don\'t miss Adam moon-walking as Jamie ... behind Jamie\'s'
                             ' back.'),
             'duration': 156,
+            'timestamp': 1303099200,
+            'upload_date': '20110418',
         },
     }

     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)

-        video_list_json = self._search_regex(r'var videoListJSON = ({.*?});',
-                                             webpage, 'video list', flags=re.DOTALL)
-        video_list = json.loads(video_list_json)
-        info = video_list['clips'][0]
-        formats = []
-        for f in info['mp4']:
-            formats.append(
-                {'url': f['src'], 'ext': 'mp4', 'tbr': int(f['bitrate'][:-1])})
+        info = self._parse_json(self._search_regex(
+            r'(?s)<script type="application/ld\+json">(.*?)</script>',
+            webpage, 'video info'), video_id)

         return {
-            'id': info['contentId'],
-            'title': video_list['name'],
-            'formats': formats,
-            'description': info['videoCaption'],
-            'thumbnail': info.get('videoStillURL') or info.get('thumbnailURL'),
-            'duration': info['duration'],
+            'id': video_id,
+            'title': info['name'],
+            'url': info['contentURL'],
+            'description': info.get('description'),
+            'thumbnail': info.get('thumbnailUrl'),
+            'timestamp': parse_iso8601(info.get('uploadDate')),
+            'duration': int_or_none(info.get('duration')),
         }
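Discovery's rewrite swaps regex-scraping of a videoListJSON variable for reading the page's JSON-LD block. The core of that, as a standalone sketch over invented minimal HTML:

    import json
    import re

    # Minimal invented stand-in for the Discovery page source.
    webpage = ('<script type="application/ld+json">'
               '{"name": "Mission Impossible Outtakes", "duration": 156}'
               '</script>')
    raw = re.search(
        r'(?s)<script type="application/ld\+json">(.*?)</script>',
        webpage).group(1)
    info = json.loads(raw)
    print(info['name'], info['duration'])  # Mission Impossible Outtakes 156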
youtube_dl/extractor/dotsub.py
@@ -1,13 +1,14 @@
 from __future__ import unicode_literals

-import re
-import time
-
 from .common import InfoExtractor
+from ..utils import (
+    float_or_none,
+    int_or_none,
+)


 class DotsubIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?dotsub\.com/view/(?P<id>[^/]+)'
+    _VALID_URL = r'https?://(?:www\.)?dotsub\.com/view/(?P<id>[^/]+)'
     _TEST = {
         'url': 'http://dotsub.com/view/aed3b8b2-1889-4df5-ae63-ad85f5572f27',
         'md5': '0914d4d69605090f623b7ac329fea66e',
@@ -15,28 +16,37 @@ class DotsubIE(InfoExtractor):
             'id': 'aed3b8b2-1889-4df5-ae63-ad85f5572f27',
             'ext': 'flv',
             'title': 'Pyramids of Waste (2010), AKA The Lightbulb Conspiracy - Planned obsolescence documentary',
+            'description': 'md5:699a0f7f50aeec6042cb3b1db2d0d074',
+            'thumbnail': 're:^https?://dotsub.com/media/aed3b8b2-1889-4df5-ae63-ad85f5572f27/p',
+            'duration': 3169,
             'uploader': '4v4l0n42',
-            'description': 'Pyramids of Waste (2010) also known as "The lightbulb conspiracy" is a documentary about how our economic system based on consumerism and planned obsolescence is breaking our planet down.\r\n\r\nSolutions to this can be found at:\r\nhttp://robotswillstealyourjob.com\r\nhttp://www.federicopistono.org\r\n\r\nhttp://opensourceecology.org\r\nhttp://thezeitgeistmovement.com',
-            'thumbnail': 'http://dotsub.com/media/aed3b8b2-1889-4df5-ae63-ad85f5572f27/p',
+            'timestamp': 1292248482.625,
             'upload_date': '20101213',
+            'view_count': int,
         }
     }

     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-        info_url = "https://dotsub.com/api/media/%s/metadata" % video_id
-        info = self._download_json(info_url, video_id)
-        date = time.gmtime(info['dateCreated'] / 1000)  # The timestamp is in miliseconds
+        video_id = self._match_id(url)
+
+        info = self._download_json(
+            'https://dotsub.com/api/media/%s/metadata' % video_id, video_id)
+        video_url = info.get('mediaURI')
+
+        if not video_url:
+            webpage = self._download_webpage(url, video_id)
+            video_url = self._search_regex(
+                r'"file"\s*:\s*\'([^\']+)', webpage, 'video url')

         return {
             'id': video_id,
-            'url': info['mediaURI'],
+            'url': video_url,
             'ext': 'flv',
             'title': info['title'],
-            'thumbnail': info['screenshotURI'],
-            'description': info['description'],
-            'uploader': info['user'],
-            'view_count': info['numberOfViews'],
-            'upload_date': '%04i%02i%02i' % (date.tm_year, date.tm_mon, date.tm_mday),
+            'description': info.get('description'),
+            'thumbnail': info.get('screenshotURI'),
+            'duration': int_or_none(info.get('duration'), 1000),
+            'uploader': info.get('user'),
+            'timestamp': float_or_none(info.get('dateCreated'), 1000),
+            'view_count': int_or_none(info.get('numberOfViews')),
         }
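dotsub reports dateCreated in milliseconds; the old code went through time.gmtime by hand, while the new one lets float_or_none(value, 1000) do the scaling and tolerate a missing field. A standalone equivalent of that utility (a simplified re-implementation for illustration, not the youtube-dl source):

    def float_or_none(v, scale=1):
        # Simplified sketch of youtube_dl.utils.float_or_none.
        return float(v) / scale if v is not None else None

    print(float_or_none(1292248482625, 1000))  # 1292248482.625 (the test value)
    print(float_or_none(None, 1000))           # None: field absent, no crash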
131  youtube_dl/extractor/drbonanza.py  Normal file
@@ -0,0 +1,131 @@
+from __future__ import unicode_literals
+
+import json
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    parse_iso8601,
+)
+
+
+class DRBonanzaIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?dr\.dk/bonanza/(?:[^/]+/)+(?:[^/])+?(?:assetId=(?P<id>\d+))?(?:[#&]|$)'
+
+    _TESTS = [{
+        'url': 'http://www.dr.dk/bonanza/serie/portraetter/Talkshowet.htm?assetId=65517',
+        'md5': 'fe330252ddea607635cf2eb2c99a0af3',
+        'info_dict': {
+            'id': '65517',
+            'ext': 'mp4',
+            'title': 'Talkshowet - Leonard Cohen',
+            'description': 'md5:8f34194fb30cd8c8a30ad8b27b70c0ca',
+            'thumbnail': 're:^https?://.*\.(?:gif|jpg)$',
+            'timestamp': 1295537932,
+            'upload_date': '20110120',
+            'duration': 3664,
+        },
+    }, {
+        'url': 'http://www.dr.dk/bonanza/radio/serie/sport/fodbold.htm?assetId=59410',
+        'md5': '6dfe039417e76795fb783c52da3de11d',
+        'info_dict': {
+            'id': '59410',
+            'ext': 'mp3',
+            'title': 'EM fodbold 1992 Danmark - Tyskland finale Transmission',
+            'description': 'md5:501e5a195749480552e214fbbed16c4e',
+            'thumbnail': 're:^https?://.*\.(?:gif|jpg)$',
+            'timestamp': 1223274900,
+            'upload_date': '20081006',
+            'duration': 7369,
+        },
+    }]
+
+    def _real_extract(self, url):
+        url_id = self._match_id(url)
+        webpage = self._download_webpage(url, url_id)
+
+        if url_id:
+            info = json.loads(self._html_search_regex(r'({.*?%s.*})' % url_id, webpage, 'json'))
+        else:
+            # Just fetch the first video on that page
+            info = json.loads(self._html_search_regex(r'bonanzaFunctions.newPlaylist\(({.*})\)', webpage, 'json'))
+
+        asset_id = str(info['AssetId'])
+        title = info['Title'].rstrip(' \'\"-,.:;!?')
+        duration = int_or_none(info.get('Duration'), scale=1000)
+        # First published online. "FirstPublished" contains the date for original airing.
+        timestamp = parse_iso8601(
+            re.sub(r'\.\d+$', '', info['Created']))
+
+        def parse_filename_info(url):
+            match = re.search(r'/\d+_(?P<width>\d+)x(?P<height>\d+)x(?P<bitrate>\d+)K\.(?P<ext>\w+)$', url)
+            if match:
+                return {
+                    'width': int(match.group('width')),
+                    'height': int(match.group('height')),
+                    'vbr': int(match.group('bitrate')),
+                    'ext': match.group('ext')
+                }
+            match = re.search(r'/\d+_(?P<bitrate>\d+)K\.(?P<ext>\w+)$', url)
+            if match:
+                return {
+                    'vbr': int(match.group('bitrate')),
+                    'ext': match.group(2)
+                }
+            return {}
+
+        video_types = ['VideoHigh', 'VideoMid', 'VideoLow']
+        preferencemap = {
+            'VideoHigh': -1,
+            'VideoMid': -2,
+            'VideoLow': -3,
+            'Audio': -4,
+        }
+
+        formats = []
+        for file in info['Files']:
+            if info['Type'] == "Video":
+                if file['Type'] in video_types:
+                    format = parse_filename_info(file['Location'])
+                    format.update({
+                        'url': file['Location'],
+                        'format_id': file['Type'].replace('Video', ''),
+                        'preference': preferencemap.get(file['Type'], -10),
+                    })
+                    formats.append(format)
+                elif file['Type'] == "Thumb":
+                    thumbnail = file['Location']
+            elif info['Type'] == "Audio":
+                if file['Type'] == "Audio":
+                    format = parse_filename_info(file['Location'])
+                    format.update({
+                        'url': file['Location'],
+                        'format_id': file['Type'],
+                        'vcodec': 'none',
+                    })
+                    formats.append(format)
+                elif file['Type'] == "Thumb":
+                    thumbnail = file['Location']
+
+        description = '%s\n%s\n%s\n' % (
+            info['Description'], info['Actors'], info['Colophon'])
+
+        for f in formats:
+            f['url'] = f['url'].replace('rtmp://vod-bonanza.gss.dr.dk/bonanza/', 'http://vodfiles.dr.dk/')
+            f['url'] = f['url'].replace('mp4:bonanza', 'bonanza')
+        self._sort_formats(formats)
+
+        display_id = re.sub(r'[^\w\d-]', '', re.sub(r' ', '-', title.lower())) + '-' + asset_id
+        display_id = re.sub(r'-+', '-', display_id)
+
+        return {
+            'id': asset_id,
+            'display_id': display_id,
+            'title': title,
+            'formats': formats,
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
+            'duration': duration,
+        }
Some files were not shown because too many files have changed in this diff.