Commit 48cef54

samueloph and sergiodj committed
Introduce new options for downloading a list of URLs from file
There are two ways of doing this now:

1) wget way: -i, --input-file
2) curl way: providing an URL argument starting with "@"; wcurl will see "@filename" and download the URLs from "filename".

Lines starting with "#" inside input files are ignored.

This is a continuation of #58.

Co-authored-by: Sergio Durigan Junior <[email protected]>
1 parent 465dec7 commit 48cef54
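
For illustration, here is a minimal sketch of the two invocation styles described in the commit message; the file name urls.txt and the URLs are hypothetical, not part of this commit:

```sh
# A hypothetical input file; lines starting with "#" are ignored by wcurl.
cat > urls.txt <<'EOF'
# comment lines like this one are skipped
https://example.com/a.txt
https://example.com/b.txt
EOF

# 1) wget way: read the URL list from a file.
wcurl --input-file urls.txt    # same as: wcurl -i urls.txt

# 2) curl way: an argument starting with "@" names an input file.
wcurl @urls.txt
```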

File tree (3 files changed: +76 / -13 lines):

- README.md
- wcurl
- wcurl.md


README.md

Lines changed: 11 additions & 2 deletions
@@ -42,8 +42,9 @@ sudo mandb
 
 ```text
 wcurl <URL>...
-wcurl [--curl-options <CURL_OPTIONS>]... [--no-decode-filename] [-o|-O|--output <PATH>] [--dry-run] [--] <URL>...
-wcurl [--curl-options=<CURL_OPTIONS>]... [--no-decode-filename] [--output=<PATH>] [--dry-run] [--] <URL>...
+wcurl -i <INPUT_FILE>...
+wcurl [--curl-options <CURL_OPTIONS>]... [--no-decode-filename] [-o|-O|--output <PATH>] [-i|--input-file <PATH>]... [--dry-run] [--] [<URL>]...
+wcurl [--curl-options=<CURL_OPTIONS>]... [--no-decode-filename] [--output=<PATH>] [--input-file=<PATH>]... [--dry-run] [--] [<URL>]...
 wcurl -V|--version
 wcurl -h|--help
 ```
@@ -85,6 +86,12 @@ should be using curl directly if your use case is not covered.
   the end (curl >= 7.83.0). If this option is provided multiple times, only the
   last value is considered.
 
+* `-i, --input-file=<PATH>`
+
+  Download all URLs listed in the input file. Can be used multiple times and
+  mixed with URLs as parameters. This is equivalent to setting `@\<PATH\>` as an
+  URL argument. Lines starting with `#` are ignored.
+
 * `--no-decode-filename`
 
   Don't percent-decode the output filename, even if the percent-encoding in the
@@ -112,6 +119,8 @@ instead forwarded to the curl invocation.
   URL to be downloaded. Anything that is not a parameter is considered
   an URL. Whitespaces are percent-encoded and the URL is passed to curl, which
   then performs the parsing. May be specified more than once.
+  Arguments starting with `@` are considered as a file containing multiple URLs to
+  be downloaded; `@\<PATH\>` is equivalent to using `--input-file \<PATH\>`.
 
 # Examples
 
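
As a quick sketch of the wording above ("can be used multiple times and mixed with URLs as parameters"), an invocation along these lines should work; the file names and URL are made up for the example:

```sh
# Two input files plus a plain URL argument in the same run.
wcurl -i mirrors.txt --input-file extra-urls.txt https://example.com/standalone.tar.gz
```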

wcurl

Lines changed: 53 additions & 9 deletions
@@ -49,8 +49,9 @@ usage()
 ${PROGRAM_NAME} -- a simple wrapper around curl to easily download files.
 
 Usage: ${PROGRAM_NAME} <URL>...
-       ${PROGRAM_NAME} [--curl-options <CURL_OPTIONS>]... [--no-decode-filename] [-o|-O|--output <PATH>] [--dry-run] [--] <URL>...
-       ${PROGRAM_NAME} [--curl-options=<CURL_OPTIONS>]... [--no-decode-filename] [--output=<PATH>] [--dry-run] [--] <URL>...
+       ${PROGRAM_NAME} -i <INPUT_FILE>...
+       ${PROGRAM_NAME} [--curl-options <CURL_OPTIONS>]... [--no-decode-filename] [-o|-O|--output <PATH>] [-i|--input-file <PATH>]... [--dry-run] [--] [<URL>]...
+       ${PROGRAM_NAME} [--curl-options=<CURL_OPTIONS>]... [--no-decode-filename] [--output=<PATH>] [--input-file=<PATH>]... [--dry-run] [--] [<URL>]...
        ${PROGRAM_NAME} -h|--help
        ${PROGRAM_NAME} -V|--version
 
@@ -64,6 +65,10 @@ Options:
                            number appended to the end (curl >= 7.83.0). If this option is provided
                            multiple times, only the last value is considered.
 
+  -i, --input-file <PATH>: Download all URLs listed in the input file. Can be used multiple times
+                           and mixed with URLs as parameters. This is equivalent to setting
+                           "@<PATH>" as an URL argument. Lines starting with "#" are ignored.
+
   --no-decode-filename: Don't percent-decode the output filename, even if the percent-encoding in
                         the URL was done by wcurl, e.g.: The URL contained whitespaces.
 
@@ -79,6 +84,8 @@ Options:
   <URL>: URL to be downloaded. Anything that is not a parameter is considered
          an URL. Whitespaces are percent-encoded and the URL is passed to curl, which
          then performs the parsing. May be specified more than once.
+         Arguments starting with "@" are considered as a file containing multiple URLs to be
+         downloaded; "@<PATH>" is equivalent to using "--input-file <PATH>".
 _EOF_
 }
 
@@ -116,6 +123,34 @@ readonly PER_URL_PARAMETERS="\
 # Whether to invoke curl or not.
 DRY_RUN="false"
 
+# Add URLs to list of URLs to be downloaded.
+# If the argument starts with "@", then it's a file containing the URLs
+# to be downloaded (an "input file").
+# When parsing an input file, ignore lines starting with "#".
+# This function also percent-encodes the whitespaces in URLs.
+add_urls()
+{
+    case "$1" in
+        @*)
+            while read -r url; do
+                case "$url" in
+                    \#*) : ;;
+                    *)
+                        # Percent-encode whitespaces into %20, since wget supports those URLs.
+                        newurl=$(printf "%s\n" "${url}" | sed 's/ /%20/g')
+                        URLS="${URLS} ${newurl}"
+                        ;;
+                esac
+            done < "${1#@}"
+            ;;
+        *)
+            # Percent-encode whitespaces into %20, since wget supports those URLs.
+            newurl=$(printf "%s\n" "${1}" | sed 's/ /%20/g')
+            URLS="${URLS} ${newurl}"
+            ;;
+    esac
+}
+
 # Sanitize parameters.
 sanitize()
 {
@@ -279,6 +314,19 @@ while [ -n "${1-}" ]; do
             OUTPUT_PATH="${opt}"
             ;;
 
+        --input-file=*)
+            add_urls "@$(printf "%s\n" "${1}" | sed 's/^--input-file=//')"
+            ;;
+
+        -i | --input-file)
+            shift
+            add_urls "@${1}"
+            ;;
+
+        -i*)
+            add_urls "@$(printf "%s\n" "${1}" | sed 's/^-i//')"
+            ;;
+
         --no-decode-filename)
             DECODE_FILENAME="false"
             ;;
@@ -296,10 +344,8 @@ while [ -n "${1-}" ]; do
         --)
             # This is the start of the list of URLs.
             shift
-            for url in "$@"; do
-                # Encode whitespaces into %20, since wget supports those URLs.
-                newurl=$(printf "%s\n" "${url}" | sed 's/ /%20/g')
-                URLS="${URLS} ${newurl}"
+            for arg in "$@"; do
+                add_urls "${arg}"
             done
             break
             ;;
@@ -310,9 +356,7 @@ while [ -n "${1-}" ]; do
 
         *)
             # This must be a URL.
-            # Encode whitespaces into %20, since wget supports those URLs.
-            newurl=$(printf "%s\n" "${1}" | sed 's/ /%20/g')
-            URLS="${URLS} ${newurl}"
+            add_urls "${1}"
            ;;
     esac
     shift
wcurl.md

Lines changed: 12 additions & 2 deletions
@@ -18,9 +18,11 @@ Added-in: n/a
 
 **wcurl \<URL\>...**
 
-**wcurl [--curl-options \<CURL_OPTIONS\>]... [--dry-run] [--no-decode-filename] [-o|-O|--output \<PATH\>] [--] \<URL\>...**
+**wcurl -i \<INPUT_FILE\>...**
 
-**wcurl [--curl-options=\<CURL_OPTIONS\>]... [--dry-run] [--no-decode-filename] [--output=\<PATH\>] [--] \<URL\>...**
+**wcurl [--curl-options \<CURL_OPTIONS\>]... [--dry-run] [--no-decode-filename] [-o|-O|--output \<PATH\>] [-i|--input-file \<PATH\>]... [--] [\<URL\>]...**
+
+**wcurl [--curl-options=\<CURL_OPTIONS\>]... [--dry-run] [--no-decode-filename] [--output=\<PATH\>] [--input-file=\<PATH\>]... [--] [\<URL\>]...**
 
 **wcurl -V|--version**
 
@@ -82,6 +84,12 @@ URLs are provided, resulting files share the same name with a number appended to
 the end (curl \>= 7.83.0). If this option is provided multiple times, only the
 last value is considered.
 
+## -i, --input-file=\<PATH\>
+
+Download all URLs listed in the input file. Can be used multiple times and
+mixed with URLs as parameters. This is equivalent to setting `@\<PATH\>` as an
+URL argument. Lines starting with `#` are ignored.
+
 ## --no-decode-filename
 
 Don't percent-decode the output filename, even if the percent-encoding in the
@@ -109,6 +117,8 @@ is instead forwarded to the curl invocation.
 URL to be downloaded. Anything that is not a parameter is considered
 an URL. Whitespaces are percent-encoded and the URL is passed to curl, which
 then performs the parsing. May be specified more than once.
+Arguments starting with `@` are considered as a file containing multiple URLs to be
+downloaded; `@\<PATH\>` is equivalent to using `--input-file \<PATH\>`.
 
 # EXAMPLES
 