132 Commits

Author SHA1 Message Date
Mark Johnston
f4e05cc55d Document fetchReqHTTP().
Submitted by:	Farhan Khan <khanzf@gmail.com>
Reviewed by:	0mp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D18788
2019-08-28 17:01:28 +00:00
Dag-Erling Smørgrav
a768df3e91 When deciding whether to send the complete URL or just the document part,
we were looking at the original URL rather than the one we were currently
processing.  This meant that if we were trying to retrieve an HTTP URL but
were redirected to an HTTPS URL, and HTTPS proxying was enabled, we would
send an invalid request and most likely get garbage back.

MFC after:	3 days
2018-11-27 16:23:17 +00:00
Dag-Erling Smørgrav
ceedec4bce A few more cases where strcasecmp() is no longer required.
MFC after:	1 week
2018-11-27 11:22:19 +00:00
Dag-Erling Smørgrav
f2eac20246 Fix a few (but far from all) style issues.
MFC after:	3 weeks
2018-05-29 10:29:43 +00:00
Dag-Erling Smørgrav
c5712d6da1 Use __VA_ARGS__ to simplify the DEBUG macro.
MFC after:	3 weeks
2018-05-29 10:28:20 +00:00
Dag-Erling Smørgrav
b847b083a8 Preserve if-modified-since timestamps across redirects.
PR:		224426
MFC after:	1 week
2018-05-12 17:02:27 +00:00
Pedro F. Giffuni
5e53a4f90f lib: further adoption of SPDX licensing ID tags.
Mainly focus on files that use the BSD 2-Clause license; however, the tool I
was using misidentified many licenses, so this was mostly a manual,
error-prone task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
open source licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
2017-11-26 02:00:33 +00:00
Dag-Erling Smørgrav
08a49957b3 r308996 broke IP literals by assuming that a colon could only occur as
a separator between host and port, and using strchr() to search for it.
Rewrite fetch_resolve() so it handles bracketed literals correctly, and
remove similar code elsewhere to avoid passing unbracketed literals to
fetch_resolve().  Remove #ifdef INET6 so we still parse IP literals
correctly even if we do not have the ability to connect to them.

While there, fix an off-by-one error which caused HTTP 400 errors to be
misinterpreted as redirects.

PR:		217723
MFC after:	1 week
Reported by:	bapt, bz, cem, ngie
2017-03-17 14:18:52 +00:00
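The bracketed-literal problem described in this commit can be illustrated with a minimal sketch. This is not the code of fetch_resolve(); `split_hostport()` and its parameters are hypothetical names chosen for illustration. It only shows why searching for the first colon with strchr() is wrong for `[::1]:8080`, and how looking for the closing bracket first avoids that.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not libfetch's fetch_resolve()): split "host",
 * "host:port", "[v6]" or "[v6]:port" into separate host and port strings.
 * Returns 0 on success, -1 on malformed input or insufficient space.
 */
static int
split_hostport(const char *in, char *host, size_t hostsz,
    char *port, size_t portsz)
{
	const char *hb, *he, *sep;
	size_t hlen, plen;

	if (hostsz == 0 || portsz == 0)
		return (-1);
	if (*in == '[') {
		/* Bracketed literal: the host ends at the matching ']'. */
		hb = in + 1;
		if ((he = strchr(hb, ']')) == NULL)
			return (-1);
		sep = he + 1;
		if (*sep != '\0' && *sep != ':')
			return (-1);
	} else {
		/* Unbracketed name: a colon, if present, starts the port. */
		hb = in;
		sep = strchr(hb, ':');
		he = (sep != NULL) ? sep : hb + strlen(hb);
		if (sep == NULL)
			sep = he;
	}
	hlen = (size_t)(he - hb);
	if (hlen == 0 || hlen >= hostsz)
		return (-1);
	memcpy(host, hb, hlen);
	host[hlen] = '\0';
	if (*sep == ':')
		sep++;			/* skip the host/port separator */
	plen = strlen(sep);
	if (plen >= portsz)
		return (-1);
	memcpy(port, sep, plen + 1);	/* includes the terminating NUL */
	return (0);
}
```

In this sketch an unbracketed IPv6 literal such as `::1` is rejected, on the assumption that callers always pass literals in brackets.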
Dag-Erling Smørgrav
c8453e5bf4 Fix partial requests (used by fetch -r) when the requested file is
already complete.

Since 416 is an error code, any Content-Range header in the response
would refer to the error message, not the requested document, so
relying on the value of size when we know we got a 416 is wrong.
Instead, just verify that offset == 0 and assume that we've reached
the end of the document (if offset > 0, we did not request a range,
and the server is screwing with us).  Note that we cannot distinguish
between reaching the end and going past it, but that is a flaw in the
protocol, not in the code, so we just have to assume that the caller
knows what it's doing.  A smart caller would request an offset
slightly before what it believes is the end and compare the result to
what is already in the file.

PR:		212065
Reported by:	mandree
MFC after:	3 weeks
2017-03-05 12:06:45 +00:00
Dag-Erling Smørgrav
21ca0912c6 Fix inverted loop condition which broke multi-line responses to CONNECT.
PR:		194483
Submitted by:	Miłosz Kaniewski <milosz.kaniewski@gmail.com>
MFC after:	1 week
2016-12-30 14:54:54 +00:00
Dag-Erling Smørgrav
a5fc9a29bb r169386 (PR 112515) was incomplete: it treated 307 as an error except
in verbose mode, and did not handle 308 at all.

r241840 (PR 172451) added support for 308, but with the same bug.

Correctly handle both by recognizing them as redirects in all places
where we check the HTTP result code.

PR:		112515 172451 209546
Submitted by:	novel@
MFC after:	1 week
2016-05-31 08:27:39 +00:00
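One way to avoid the inconsistency this commit describes is to route every check of the result code through a single predicate. The sketch below is an assumption about how that might look, not libfetch's actual macro; the name `http_is_redirect()` is hypothetical, and the set of codes is taken from the redirect-related commits in this log (305, 307, 308) plus the classic 301, 302 and 303.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate: one place that decides whether an HTTP
 * result code is treated as a redirect.  libfetch's own macro and
 * constant names may differ.
 */
static bool
http_is_redirect(int code)
{
	switch (code) {
	case 301:	/* Moved Permanently */
	case 302:	/* Found */
	case 303:	/* See Other */
	case 305:	/* Use Proxy */
	case 307:	/* Temporary Redirect */
	case 308:	/* Permanent Redirect */
		return (true);
	default:
		return (false);
	}
}
```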
Don Lewis
77b822dbc0 Use strlcpy() instead of strncpy() to copy the string returned by
setlocale() so that static analyzers know that the string is NUL
terminated.  This was causing a false positive in Coverity even
though the longest string returned by setlocale() is ENCODING_LEN
(31) and we are copying into a 64 byte buffer.  This change is also
a bit of an optimization since we don't need the strncpy() feature
of padding the rest of the destination buffer with NUL characters.

Reported by:	Coverity
CID:		974654
2016-05-12 06:39:13 +00:00
Dag-Erling Smørgrav
a982c4c7f5 Fix double-free error: r289419 moved all error handling in http_connect()
to the end of the function, but did not remove a fetch_close() call which
was made redundant by the one in the shared error-handling code.

PR:		206774
Submitted by:	Christian Heckendorf <heckendorfc@gmail.com>
MFC after:	3 days
2016-02-11 17:48:15 +00:00
Dag-Erling Smørgrav
adc1aa7a29 As a followup to r292330, standardize on size_t and add a few comments.
2015-12-16 09:20:45 +00:00
Dag-Erling Smørgrav
a568844c67 Reset bufpos to 0 immediately after refilling the buffer. Otherwise, we
risk leaving the connection in an indeterminate state if the server fails
to send a chunk delimiter.  Depending on the application and on the sizes
of the preceding chunks, the result can be anything from missing data to a
segfault.  With this patch, it will be reported as a protocol error.

PR:		204771
MFC after:	1 week
2015-12-16 09:17:07 +00:00
Dimitry Andric
a1b9b1743c Fix buildworld after r291453, similar to r284346: url->user and url->pwd
are arrays, so they can never be NULL.

Reported by:	many
Pointy hat to:	des
2015-11-29 22:37:48 +00:00
Dag-Erling Smørgrav
4d8b056ef1 Use .netrc for HTTP sites and proxies, not just FTP.
PR:		193740
Submitted by:	TEUBEL György <tgyurci@gmail.com>
MFC after:	1 week
2015-11-29 14:26:59 +00:00
Dag-Erling Smørgrav
c3f9b93bd9 Fix two bugs in HTTPS tunnelling:
 - If the proxy returns a non-200 result, set the error code accordingly
   so the caller / user gets a somewhat meaningful error message.
 - Consume and discard any HTTP response header following the result line.

PR:		194483
Tested by:	Fabian Keil <fk@fabiankeil.de>
MFC after:	1 week
2015-10-16 12:21:44 +00:00
Marcelo Araujo
ddcc2ecb3a Remove unused variable to silence clang warning.
Differential Revision:	D2683
Reviewed by:		rodrigc, bapt
2015-07-04 17:22:07 +00:00
Dimitry Andric
f2c41c554d Fix the following clang 3.7.0 warnings in lib/libfetch/http.c:
lib/libfetch/http.c:1628:26: error: address of array 'purl->user'
    will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
                                    aparams.user = purl->user ?
                                                   ~~~~~~^~~~ ~
    lib/libfetch/http.c:1630:30: error: address of array 'purl->pwd'
    will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
                                    aparams.password = purl->pwd?
                                                       ~~~~~~^~~~
    lib/libfetch/http.c:1657:25: error: address of array 'url->user'
    will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
                                    aparams.user = url->user ?
                                                   ~~~~~^~~~ ~
    lib/libfetch/http.c:1659:29: error: address of array 'url->pwd'
    will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
                                    aparams.password = url->pwd ?
                                                       ~~~~~^~~ ~
    lib/libfetch/http.c:1669:25: error: address of array 'url->user'
    will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
                                    aparams.user = url->user ?
                                                   ~~~~~^~~~ ~
    lib/libfetch/http.c:1671:29: error: address of array 'url->pwd'
    will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
                                    aparams.password = url->pwd ?
                                                       ~~~~~^~~ ~

Since url->user and url->pwd are arrays, they can never be NULL, so the
checks can be removed.

Reviewed by:	bapt
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D2673
2015-06-13 19:26:48 +00:00
Baptiste Daroussin
c41991303c Add support for arbitrary http requests
Submitted by:	Alex Hornung <alex@alexhornung.com>
Reviewed by:	des
Obtained from:	Dragonfly
MFC after:	3 weeks
2014-06-05 22:16:26 +00:00
Baptiste Daroussin
4bd8c06c3a Remove unnecessary semicolons
Patch by Sascha Wildner <saw@online.de> for Dragonfly

Reviewed by:	des
Obtained from:	Dragonfly
MFC after:	1 week
2014-06-05 22:13:30 +00:00
Baptiste Daroussin
6064928d01 Use NULL instead of 0
Patch by Sascha Wildner <saw@online.de> for Dragonfly

Reviewed by:	des
Obtained from:	Dragonfly
MFC after:	1 week
2014-06-05 22:10:25 +00:00
Dag-Erling Smørgrav
c257f99e9b If HTTP_USER_AGENT is defined but empty, don't send User-Agent at all.
PR:		184507
Submitted by:	jbeich@tormail.org (with modifications)
MFC after:	1 week
2014-06-05 20:27:16 +00:00
Bryan Drewery
b36853caf1 Support Last-Modified behind proxies which return UTC instead of GMT.
The standard states that GMT must be used, but that UTC is equivalent. Still
parse UTC as otherwise this causes problems for pkg(8). It will refetch
the repository every time 'pkg update' or other remote operations
are used behind these proxies.

RFC2616: "All HTTP date/time stamps MUST be represented in Greenwich Mean
Time (GMT), without exception. For the purposes of HTTP, GMT is exactly equal
to UTC (Coordinated Universal Time)."

Approved by:	bapt (mentor)
Reviewed by:	des, peter
Sponsored by:	EMC / Isilon Storage Division
MFC after:	1 week
2014-03-11 13:47:11 +00:00
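The GMT/UTC leniency described in this commit can be sketched as follows. This is not the libfetch implementation; `parse_http_date()` and `struct http_date` are hypothetical, and the sketch uses sscanf() so that it stays within standard C. It only demonstrates accepting either zone name on an RFC 1123 style date and rejecting anything else.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the parsed fields. */
struct http_date {
	int year, mday, hour, min, sec;
	char wday[4], mon[4];
};

/*
 * Hypothetical parser for dates such as "Sun, 06 Nov 1994 08:49:37 GMT"
 * that also accepts "UTC" as the zone name.
 * Returns 0 on success, -1 otherwise.
 */
static int
parse_http_date(const char *s, struct http_date *d)
{
	char zone[4];
	int n = 0;

	if (sscanf(s, "%3[A-Za-z], %2d %3[A-Za-z] %4d %2d:%2d:%2d %3[A-Z]%n",
	    d->wday, &d->mday, d->mon, &d->year,
	    &d->hour, &d->min, &d->sec, zone, &n) != 8)
		return (-1);
	if (s[n] != '\0')
		return (-1);	/* trailing garbage after the zone */
	/* RFC 2616 treats GMT and UTC as equal for HTTP purposes. */
	if (strcmp(zone, "GMT") != 0 && strcmp(zone, "UTC") != 0)
		return (-1);
	return (0);
}
```

The sketch does not validate field ranges or convert to a time_t; a real caller would need both.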
Dag-Erling Smørgrav
4524013cd3 Bump copyright dates
2014-01-30 08:37:23 +00:00
Dag-Erling Smørgrav
9c1ca3a1dd r261230 broke the cases where the amount of data to be read is not
known in advance, or where the caller doesn't care and just keeps
reading until it hits EOF.

In fetch_read(): the socket is non-blocking, so read() will return 0
on EOF, and -1 (errno == EAGAIN) when the connection is still open but
there is no data waiting.  In the first case, we should immediately
return 0.  The EINTR case was also broken, although not in a way that
matters.

In fetch_writev(): use timersub() and timercmp() as in fetch_read().

In http_fillbuf(): set errno to a sensible value when an invalid chunk
header is encountered.

In http_readfn(): as in fetch_read(), a zero return from down the
stack indicates EOF, not an error.  Furthermore, when io->error is
EINTR, clear it (but not errno) before returning so the caller can
retry after dealing with the interrupt.

MFC after:	3 days
2014-01-29 12:48:19 +00:00
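The distinction this commit draws for a non-blocking socket (0 means EOF, -1 with EAGAIN means "still open, no data yet") can be captured in a small classification helper. The sketch below is illustrative only; `classify_read()` and `enum read_status` are hypothetical names and do not appear in libfetch.

```c
#include <errno.h>
#include <sys/types.h>

enum read_status {
	READ_DATA,		/* got at least one byte */
	READ_EOF,		/* peer closed the connection */
	READ_WOULD_BLOCK,	/* connection open, nothing to read yet */
	READ_INTERRUPTED,	/* interrupted by a signal; caller may retry */
	READ_ERROR		/* any other failure */
};

/*
 * Hypothetical helper: interpret the return value of read(2) on a
 * non-blocking descriptor together with the errno value it left behind.
 */
static enum read_status
classify_read(ssize_t r, int err)
{
	if (r > 0)
		return (READ_DATA);
	if (r == 0)
		return (READ_EOF);
	if (err == EAGAIN || err == EWOULDBLOCK)
		return (READ_WOULD_BLOCK);
	if (err == EINTR)
		return (READ_INTERRUPTED);
	return (READ_ERROR);
}
```

A caller would typically wait with poll(2) on READ_WOULD_BLOCK, retry on READ_INTERRUPTED, and return 0 to its own caller on READ_EOF.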
Dag-Erling Smørgrav
215a27f1a4 Solve http buffering issues and hangs once and for all (hopefully!) by
simply not trying to return exactly what the caller asked for - just
return whatever we got and let the caller be the judge of whether it
was enough.  If an error occurs or the connection times out after we
already received some data, return a short read, under the assumption
that the next call will fail or time out before we read anything.

As it turns out, none of the code that calls fetch_read() assumes an
all-or-nothing result anyway, except for a couple of lines where we
read the CR LF at the end of a chunk in HTTP chunked encoding, so the
changes outside of fetch_read() and http_readfn() are minimal.

While there, replace select(2) with poll(2).

MFC after:	3 days
2014-01-28 12:48:17 +00:00
Dag-Erling Smørgrav
615c5740ef Even though it doesn't really make sense in the context of a CONNECT
request, RFC 2616 14.23 mandates the presence of the Host: header in
all HTTP 1.1 requests.

PR:		kern/181445
Submitted by:	Kimo <kimor79@yahoo.com>
MFC after:	3 days
2013-08-22 07:43:36 +00:00
Dag-Erling Smørgrav
1453595f49 Include an Accept header in requests.
PR:		kern/180917
MFC after:	1 week
2013-07-30 13:07:55 +00:00
Dag-Erling Smørgrav
dcd47379ff Implement certificate verification, and many other SSL-related
improvements; complete details in the PR.

PR:		kern/175514
Submitted by:	Michael Gmelin <freebsd@grem.de>
MFC after:	1 week
2013-07-26 15:53:43 +00:00
Dag-Erling Smørgrav
ba7c6aec97 Use the correct request syntax for proxied (tunneled) HTTPS requests.
PR:		bin/180666
MFC after:	3 days
2013-07-21 06:59:56 +00:00
Dag-Erling Smørgrav
4056bae982 Use the CONNECT method to proxy HTTPS connections through HTTP proxies.
PR:		bin/80176
Submitted by:	Yuichiro NAITO <naito.yuichiro@gmail.com>
2013-04-12 22:05:15 +00:00
Dag-Erling Smørgrav
eab7a548ba Fix weird indentation.
2012-11-16 12:31:43 +00:00
Eitan Adler
8d049fb235 Implement HTTP 305 redirect handling.
PR:		172452
Submitted by:	gcooper
Reviewed by:	des
Approved by:	cperciva
MFC after:	1 week
2012-10-22 03:00:15 +00:00
Eitan Adler
c4fa1489ec Don't deny non-temporary redirects if the -A option is set (per
the man page) [0]

While here add support for draft-reschke-http-status-308-07

PR:		172451 [0]
Submitted by:	gcooper [0]
Reviewed by:	des
Approved by:	cperciva
MFC after:	1 week
2012-10-22 03:00:10 +00:00
Eitan Adler
e6c0e200f4 Be a bit more lenient in the maximum number of redirects allowed.
Chrome and Firefox have a limit of 20. IE has a limit of 8.

Reviewed by:	des
Approved by:	cperciva
MFC after:	3 days
2012-10-22 03:00:04 +00:00
Dag-Erling Smørgrav
0e50a83330 Use libmd if and only if OpenSSL is not available.
PR:		bin/171402
MFC after:	3 days
2012-09-14 13:00:43 +00:00
Dag-Erling Smørgrav
f51b84bcc4 Don't reuse credentials if redirected to a different host.
Submitted by:	Niels Heinen <heinenn@google.com>
MFC after:	3 weeks
2012-04-30 12:12:48 +00:00
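The rule in this commit, carrying credentials over a redirect only when the host is unchanged, can be sketched as below. `struct simple_url` is a hypothetical, simplified stand-in for libfetch's URL structure, and `carry_credentials()` is not a libfetch function. The exact-match strcmp() on the host is also an assumption; host names are case-insensitive, so a real implementation may compare differently.

```c
#include <string.h>

/* Hypothetical, simplified stand-in for libfetch's struct url. */
struct simple_url {
	char host[256];
	char user[64];
	char pwd[64];
};

/*
 * Hypothetical helper: copy credentials from the original URL to the
 * redirect target only if both refer to the same host and the target
 * does not already carry credentials of its own.
 */
static void
carry_credentials(const struct simple_url *from, struct simple_url *to)
{
	if (strcmp(from->host, to->host) != 0)
		return;		/* different host: do not leak credentials */
	if (to->user[0] == '\0' && to->pwd[0] == '\0') {
		memcpy(to->user, from->user, sizeof(to->user));
		memcpy(to->pwd, from->pwd, sizeof(to->pwd));
	}
}
```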
Dag-Erling Smørgrav
2a7daafe67 Fix two issues related to the use of SIGINFO in fetch(1) to display
progress information.  The first is that fetch_read() (used in the HTTP
code but not the FTP code) can enter an infinite loop if it has previously
been interrupted by a signal.  The second is that when it is interrupted,
fetch_read() will discard any data it may have read up to that point.
Luckily, both bugs are extremely timing-sensitive and therefore difficult
to trigger.

PR:		bin/153240
Submitted by:	Mark <markjdb@gmail.com>
MFC after:	3 weeks
2012-01-18 15:13:21 +00:00
Dag-Erling Smørgrav
578153f1ba latin1 -> utf8
2011-10-19 11:43:51 +00:00
Dag-Erling Smørgrav
6337341d81 Update copyright dates and strip my middle name.
2011-09-27 18:57:26 +00:00
Dag-Erling Smørgrav
eb9b80c30d Increase WARNS to 4.
2011-05-12 21:26:42 +00:00
Dag-Erling Smørgrav
c12c6e3cda Mechanical whitespace cleanup.
2011-05-12 21:18:55 +00:00
Dag-Erling Smørgrav
a42eecded0 Increase WARNS to 3.
2011-05-12 21:12:24 +00:00
Dag-Erling Smørgrav
c954ded250 Fix a couple of embarrassing mistakes in the previous commit.
Submitted by:	Dimitry Andric <dimitry@andric.com>
2010-07-28 15:29:18 +00:00
Dag-Erling Smørgrav
962cf29525 If the A flag is supplied, http_request() will attempt the request only
once, even if authentication is required, instead of retrying with the
proper credentials.  Fix this by bumping the countdown if the origin or
proxy server requests authentication so that the initial unauthenticated
request does not count as an attempt.

PR:		148087
Submitted by:	Tom Evans <tevans.uk@googlemail.com>
MFC after:	2 weeks
2010-07-01 17:44:33 +00:00
Dag-Erling Smørgrav
79ad329d0c Add HTTP digest authentication.
Submitted by:	Jean-Francois Dockes <jf@dockes.org>
Forgotten by:	des (repeatedly)
2010-01-19 10:19:55 +00:00
Murray Stokely
7f92799f67 Add support for HTTP 1.1 If-Modified-Since behavior.
fetch(1) accepts a new argument -i <file> that if specified will cause
the file to be downloaded only if it is more recent than the mtime of
<file>.

libfetch(3) accepts the mtime in the url structure and a flag to
indicate when this behavior is desired.

PR:		bin/87841
Submitted by:	Jukka A. Ukkonen <jau@iki.fi> (partially)
Reviewed by:	des, ru
MFC after:	3 weeks
2008-12-15 08:27:44 +00:00
Ruslan Ermilov
e374393a07 Don't fail mistakenly with -r when we already have the whole file.
Reviewed by:	des
2008-10-24 07:56:01 +00:00