Commit Graph

191106 Commits

Author SHA1 Message Date
Gleb Smirnoff
1fbe6a82f4 Improve reference counting of EXT_SFBUF pages attached to mbufs.
o Do not use UMA refcount zone. The problem with this zone is that
  several refcounting words (16 on amd64) share the same cache line,
  and issuing atomic(9) updates on them creates cache line contention.
  Also, allocating and freeing them is extra CPU cycles.
  Instead, refcount the page directly via vm_page_wire() and the sfbuf
  via sf_buf_alloc(sf_buf_page(sf)) [1].

o Call refcounting/freeing function for EXT_SFBUF via direct function
  call, instead of function pointer. This removes a barrier for the CPU
  branch predictor.

o Do not clean up the mbuf to be freed in mb_free_ext(), merely to
  satisfy an assertion in mb_dtor_mbuf(). Remove the assertion from
  mb_dtor_mbuf(). Use bcopy() instead of manual assignments to
  copy m_ext in mb_dupcl().

[1] This has some problems for now. Using sf_buf_alloc() merely to
    increase refcount is expensive, and is broken on sparc64. To be
    fixed.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-07-11 19:40:50 +00:00
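
The "16 on amd64" figure in 1fbe6a82f4 follows from simple arithmetic, sketched below. This is only an illustration: the 64-byte cache line size and the 4-byte counter width are assumptions, and the helper names are hypothetical rather than anything from the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; 64 bytes is typical for amd64. */
#define EX_CACHE_LINE_SIZE 64

/* Number of counters of the given size that fit in one cache line. */
static size_t
ex_counters_per_line(size_t counter_size)
{
	return (EX_CACHE_LINE_SIZE / counter_size);
}

/*
 * Index of the cache line holding counter number idx when counters are
 * packed back to back.  Counters with the same line index contend when
 * different CPUs update them atomically.
 */
static size_t
ex_line_of_counter(size_t idx, size_t counter_size)
{
	return ((idx * counter_size) / EX_CACHE_LINE_SIZE);
}
```

With 4-byte counters, indices 0 through 15 land on the same line, so unrelated mbufs can end up contending for it.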
Michael Tuexen
f64a0b069a Bugfix: When a remote address was added to an endpoint,
a source address was selected and cached, but it was not
stored that it was cached. This resulted in selecting
different source addresses for the INIT-ACK and COOKIE-ACK
when possible.
Thanks to Niu Zhixiong for reporting the issue.

MFC after: 1 week
2014-07-11 17:31:40 +00:00
Bryan Drewery
1d212ea524 Fix vmstat -M after r263620 renamed 'cnt' to 'vm_cnt'.
This was showing as:
  vmstat: undefined symbols:
   _cnt

To remain backwards compatible with older dumps, if the 'vm_cnt' symbol is
not found then try again with 'cnt'.

Reported by:	pho
Sponsored by:	EMC / Isilon Storage Division
2014-07-11 16:45:55 +00:00
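
The fallback described in 1d212ea524 can be sketched as a lookup over an ordered list of candidate names. This is a minimal sketch, not the vmstat code: the callback type and both function names are hypothetical stand-ins for whatever symbol lookup the tool really uses.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup callback: returns 0 if the symbol exists. */
typedef int (*ex_sym_lookup_fn)(const char *name, void *arg);

/*
 * Try the new symbol name first, then the old one, and return the
 * first name that resolves, or NULL if neither does.
 */
static const char *
ex_resolve_vm_cnt(ex_sym_lookup_fn lookup, void *arg)
{
	static const char *const candidates[] = { "vm_cnt", "cnt" };
	size_t i;

	for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
		if (lookup(candidates[i], arg) == 0)
			return (candidates[i]);
	}
	return (NULL);
}

/* Stand-in lookup for experiments: only the name passed in arg exists. */
static int
ex_only_has(const char *name, void *arg)
{
	return (strcmp(name, (const char *)arg) == 0 ? 0 : -1);
}
```

Calling `ex_resolve_vm_cnt(ex_only_has, (void *)"cnt")` models an older dump in which only the old name is present.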
Cy Schubert
42834773b3 Remove redundant USE_INET6 test that enables INET6 in the ipfilter userland
regardless of the setting in make.conf.

PR:		190964
Approved by:	glebius (mentor)
MFC after:	1 week
2014-07-11 16:26:51 +00:00
John Baldwin
9f72c0322c Fix some edge cases with rewinddir():
- In the unionfs case, opendir() and fdopendir() read the directory's full
  contents and cache it.  This cache is not refreshed when rewinddir() is
  called, so rewinddir() will not notice updates to a directory.  Fix this
  by splitting the code to fetch a directory's contents out of
  __opendir_common() into a new _filldir() function and call this from
  rewinddir() when operating on a unionfs directory.
- If rewinddir() is called on a directory opened with fdopendir() before
  any directory entries are fetched, rewinddir() will not adjust the seek
  location of the backing file descriptor.  If the file descriptor passed
  to fdopendir() had a non-zero offset, rewinddir() will not rewind to
  the beginning.  Fix this by always seeking back to 0 in rewinddir().
  This means the dd_rewind hack can also be removed.

While here, add missing locking to rewinddir().

CR:   	    	https://phabric.freebsd.org/D312
Reviewed by:	jilles
MFC after:	1 week
2014-07-11 16:16:26 +00:00
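
The behaviour that 9f72c0322c is meant to guarantee can be checked from userland with standard POSIX calls. The sketch below is an illustration with hypothetical helper names. It opens a directory through fdopendir(), reads it to the end, rewinds, and reads it again. It does not reproduce the non-zero-offset case from the commit message.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Count the entries remaining in an open directory stream. */
static long
ex_count_entries(DIR *d)
{
	long n = 0;

	while (readdir(d) != NULL)
		n++;
	return (n);
}

/*
 * Read a directory twice through the same stream, with rewinddir() in
 * between.  Returns 0 on success and stores both counts, -1 on error.
 */
static int
ex_count_twice(const char *path, long *first, long *second)
{
	DIR *d;
	int fd;

	fd = open(path, O_RDONLY | O_DIRECTORY);
	if (fd < 0)
		return (-1);
	d = fdopendir(fd);
	if (d == NULL) {
		close(fd);
		return (-1);
	}
	*first = ex_count_entries(d);
	rewinddir(d);
	*second = ex_count_entries(d);
	closedir(d);	/* also closes fd */
	return (0);
}
```

On a directory that is not being modified concurrently, both counts should match.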
Gleb Smirnoff
fcc34a238c Fix style bug: rename the refcount field of m_ext to ext_cnt, to match
other members.

Sponsored by:	Nginx, Inc.
2014-07-11 14:34:29 +00:00
Gleb Smirnoff
15c28f87b8 All mbuf external free functions never fail, so let them be void.
Sponsored by:	Nginx, Inc.
2014-07-11 13:58:48 +00:00
Ed Maste
b0819f9877 Remove unused readline header
Readline is no longer installed after r268461.  A readline compatibility
header is provided by libedit, but readline definitions do not seem to
be used by LLDB anyhow.

Submitted by:	markj, Jan Beich
2014-07-11 07:31:55 +00:00
Michael Tuexen
4474d71a7b Integrate upstream changes.
MFC after: 1 week
2014-07-11 06:52:48 +00:00
Andrey V. Elsukov
ff899182ec Fix condition.
Sponsored by:	Yandex LLC
2014-07-11 06:34:15 +00:00
Marcel Moolenaar
5d4393ed1f Make this compile on older FreeBSD versions that don't have
APM_ENT_TYPE_APPLE_BOOT.
2014-07-11 01:49:25 +00:00
Neel Natu
3ada6e07ac Use the correct offset when converting a logical address (segment:offset)
to a linear address.
2014-07-11 01:23:38 +00:00
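
For readers unfamiliar with the segment:offset terminology in 3ada6e07ac, the simplest instance is x86 real mode, where the segment base is the selector shifted left by four bits. The function below only illustrates that general rule. It is not the code touched by the commit, and its name is hypothetical.

```c
#include <stdint.h>

/*
 * Real-mode logical-to-linear conversion: the segment base is the
 * 16-bit selector times 16, and the offset is added to that base.
 */
static uint32_t
ex_real_mode_linear(uint16_t segment, uint16_t offset)
{
	return (((uint32_t)segment << 4) + offset);
}
```

For example, 07C0:0000 maps to linear address 0x7C00. Using the wrong offset value in this sum produces a wrong linear address even when the segment base is correct.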
Mateusz Guzik
88f98985aa Eliminate plim and vtmp local vars in exit1.
No functional changes.

MFC after:	1 week
2014-07-10 22:54:38 +00:00
Mateusz Guzik
30d58d6b39 Don't make a temporary copy of fixed sysctl strings.
2014-07-10 21:46:57 +00:00
Warner Losh
9e88096ea1 Make MK_GNUCXX mean "build the libstdc++ and libsupc++ libraries" and
nothing more. Force it to be "no" when MK_CXX is "no" to simplify
usage.  It no longer also means "build g++" since we no longer have a
platform where that's interesting now that pc98 no longer needs clang
and gcc, but not g++. pc98 now just uses clang after boot2 changes.
2014-07-10 21:11:48 +00:00
Mateusz Guzik
b23c40d7b1 Don't zero fd_nfiles during fdp destruction.
Code trying to take a look has to check fd_refcnt and it is 0 by that time.

This is a follow up to r268505, without this the code would leak memory for
tables bigger than the default.

MFC after:	1 week
2014-07-10 21:05:45 +00:00
Mateusz Guzik
e518baf8f9 Avoid relocking filedesc lock when closing fds during fdp destruction.
Don't call bzero nor fdunused from fdfree for such cases. It would do
unnecessary work and complain that the lock is not taken.

MFC after:	1 week
2014-07-10 20:59:54 +00:00
Alan Cox
bfc30490a7 Correct the accounting code for wired mappings. The wrong field of the PVO
entry was being tested.  We were incrementing and decrementing the pmap's
wired mapping count based on whether the physical page being mapped or
unmapped was cache coherent, not whether it was a wired mapping.

Reviewed by:	nwhitehorn
2014-07-10 20:55:38 +00:00
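
The class of bug fixed in bfc30490a7 (testing a flag in the wrong field) can be illustrated with a small model. Everything here is hypothetical: the structure layout, the flag names and the bit values are invented for the example and are not the real PVO definitions.

```c
#include <stdint.h>

#define EX_PVO_WIRED	0x1u	/* hypothetical: mapping is wired */
#define EX_PTE_COHERENT	0x2u	/* hypothetical: page is cache coherent */

struct ex_pmap {
	long wired_count;	/* number of wired mappings */
};

struct ex_pvo {
	uint32_t vaddr_flags;	/* per-mapping flags, including wired */
	uint32_t pte_flags;	/* page attribute flags, including coherency */
};

/* Account for a new mapping: only the wired flag matters. */
static void
ex_account_enter(struct ex_pmap *pm, const struct ex_pvo *pvo)
{
	if ((pvo->vaddr_flags & EX_PVO_WIRED) != 0)
		pm->wired_count++;
}

/* Account for a removed mapping, mirroring ex_account_enter(). */
static void
ex_account_remove(struct ex_pmap *pm, const struct ex_pvo *pvo)
{
	if ((pvo->vaddr_flags & EX_PVO_WIRED) != 0)
		pm->wired_count--;
}
```

In this model, the buggy variant would test `pte_flags & EX_PTE_COHERENT` instead, so the counter would track cache coherency rather than wiring.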
Warner Losh
9e48836654 Separate out the links creation from the other targets. This was
supposed to have been done for the original commit, but somebody
forgot.

Pointy-hat-to:  imp@
2014-07-10 18:28:12 +00:00
Pedro F. Giffuni
c4a1f025cf Sync some (mostly cosmetic) changes from NetBSD
Makefile,v 1.37
tc1.c v 1.3
Rename TEST/test.c tc1.c

common.c,v 1.23
pass lint on _LP64.

emacs.c,v 1.22
pass lint on _LP64.

filecomplete.h,v 1.8
mv NetBSD ID back from 1.9 as we don't
have the widecharacter support.

prompt.c,v 1.14
prompt.h,v 1.9
term.h,v 1.20
read.h,v 1.6
Update NetBSD version strings

sys.h,v 1.12
Misc sun stuff.

tty.c 1.31
handle EINTR in the termios operations
Allow a single process to control multiple ttys (for pthreads using _REENTRANT)
using multiple EditLine objects.
pass lint on _LP64.
Don't depend on side effects inside an assert

MFC after:	1 week
Obtained from:	NetBSD
2014-07-10 17:52:17 +00:00
Mark Johnston
58e6549541 Correct the setting of the VID in transmit descriptors when hardware VLAN
tagging is enabled. This was broken in r266978.

Reported by:	gjb
Tested by:	gjb
2014-07-10 16:46:46 +00:00
Ed Schouten
e8df2232e0 Fix a couple of style nits.
- Use set instead of std::set, to be consistent with the rest of the file.
- Remove return (0); it's not required.
- Add a dash at the beginning of the copyright, per style(9).
2014-07-10 16:10:39 +00:00
Ed Schouten
63cdd39993 Don't use auto, as we also need to support GCC 4.2.
2014-07-10 15:58:28 +00:00
Ed Schouten
975f912456 Let users(1) use an std::set, instead of std::{vector,sort,unique}.
Reviewed by:	gahr
2014-07-10 15:56:15 +00:00
Baptiste Daroussin
55ba623622 Regenerate src.conf(5) after texinfo option change
2014-07-10 15:14:37 +00:00
Baptiste Daroussin
f471720995 The GNU texinfo and GNU info pages are no longer built and
installed; a WITH_INFO knob has been added to allow building and
installing them again.

Reviewed by:	imp
2014-07-10 15:05:41 +00:00
Ian Lepore
8d99c2a062 Pending interrupt status is cleared by writing to the ISR, not the data reg.
MFC after:	1 week
2014-07-10 14:06:18 +00:00
Pietro Cerutti
7150b86bfe Implement Short/Small String Optimization in SBUF(9) and change lengths and
positions in the API from ssize_t and int to size_t.

CR:		D388
Approved by:	des, bapt
2014-07-10 13:08:51 +00:00
Baptiste Daroussin
4472d6e1df Support EAGAIN in fetch_writev
Reviewed by:	des
Approved by:	des
2014-07-10 13:04:52 +00:00
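
Handling EAGAIN in a writev() loop, as 4472d6e1df does for fetch_writev, usually means waiting until the descriptor is writable and retrying. The following is a generic sketch of that pattern using only POSIX calls. It is not the libfetch implementation, and `ex_writev_all` is a hypothetical name.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Write every byte described by iov[0..iovcnt-1], retrying on EINTR
 * and waiting for writability on EAGAIN.  The iov array is modified
 * as data is consumed.  Returns the byte count, or -1 on error.
 */
static ssize_t
ex_writev_all(int fd, struct iovec *iov, int iovcnt)
{
	ssize_t total = 0;

	while (iovcnt > 0) {
		ssize_t n = writev(fd, iov, iovcnt);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			if (errno == EAGAIN || errno == EWOULDBLOCK) {
				struct pollfd pfd = { .fd = fd, .events = POLLOUT };

				/* Block until the descriptor accepts data again. */
				if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
					return (-1);
				continue;
			}
			return (-1);
		}
		total += n;
		/* Skip the buffers that were written completely. */
		while (iovcnt > 0 && (size_t)n >= iov->iov_len) {
			n -= (ssize_t)iov->iov_len;
			iov++;
			iovcnt--;
		}
		/* Advance into a partially written buffer. */
		if (iovcnt > 0 && n > 0) {
			iov->iov_base = (char *)iov->iov_base + n;
			iov->iov_len -= (size_t)n;
		}
	}
	return (total);
}
```

A caller that needs a timeout would pass a finite value to poll() instead of -1.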
Gleb Smirnoff
8ff2bd98d6 On machines with strict alignment, copy pfsync_state_key from
the packet onto the stack to avoid unaligned access.

PR:		187381
Submitted by:	Lytochkin Boris <lytboris gmail.com>
2014-07-10 12:41:58 +00:00
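
The technique named in 8ff2bd98d6 is the usual one for strict-alignment machines: copy the bytes into a properly aligned local object instead of dereferencing a pointer into the packet. A minimal sketch follows. The structure is a made-up stand-in, not the real pfsync_state_key layout.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical key layout, used only for this example. */
struct ex_state_key {
	uint32_t addr[2];
	uint16_t port[2];
};

/*
 * Load a key that starts at an arbitrary byte offset in a packet.
 * memcpy() has no alignment requirement, so this is safe even when
 * pkt + off is not aligned for uint32_t.
 */
static struct ex_state_key
ex_load_key(const uint8_t *pkt, size_t off)
{
	struct ex_state_key k;

	memcpy(&k, pkt + off, sizeof(k));
	return (k);
}
```

Casting `pkt + off` to `struct ex_state_key *` and reading through it would instead risk an alignment fault on such machines.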
Pietro Cerutti
33aa643fa0 Reimplement users(1) in C++.
This reduces the lines of code by roughly 50% (not counting the COPYRIGHT
header) and makes it more readable by using standard algorithms.

Approved by:	bapt
2014-07-10 12:15:02 +00:00
Konstantin Belousov
479fcb4e32 Unconditionally initialize addr to handle the case of changed map
timestamp while the map is unlocked.

Reported by:	bz
Sponsored by:	The FreeBSD Foundation
MFC after:	6 days
2014-07-10 11:20:24 +00:00
Gavin Atkinson
4b829b3ee0 Reword an awkward option description
PR:		191726
Reported by:	yaneurabeya gmail.com
MFC after:	3 days
2014-07-10 10:00:10 +00:00
Kevin Lo
b11ce478cf Enable 8051 before downloading firmware.
Tested by:	Carlos Jacobo Puga Medina <cpm at fbsd dot es>
2014-07-10 09:42:34 +00:00
Bryan Venteicher
32487a8973 Rework when the Tx queue completion interrupt is enabled
The Tx interrupt is now kept disabled in the common case, only
enabled when the number of free descriptors in the queue falls
below a threshold. Transmitted frames are cleared from the VQ
before subsequent transmit, or in the watchdog timer.

This was a very big performance improvement for an experimental
Netmap bhyve backend.

MFC after:	1 month
2014-07-10 05:36:04 +00:00
Bryan Venteicher
4b59668f0e Add accessor to get the number of free descriptors in the virtqueue
MFC after:	1 month
2014-07-10 05:26:01 +00:00
Adrian Chadd
0a100a6f1e Implement the first stage of multi-bind listen sockets and RSS socket
awareness.

* Introduce IP_BINDMULTI - indicating that it's okay to bind multiple
  sockets on the same bind details.

  Although the PCB code has been taught about this (see below) this patch
  doesn't introduce the rest of the PCB changes necessary to distribute
  lookups among multiple PCB entries in the global wildcard table.

* Introduce IP_RSS_LISTEN_BUCKET - placing a listen socket into the
  given RSS bucket (and thus a single PCBGROUP hash.)

* Modify the PCB add path to be aware of IP_BINDMULTI:
  + Only allow further PCB entries to be added if the owner credentials
    match and IP_BINDMULTI has been specified.  I.e., only allow further
    IP_BINDMULTI sockets to appear if the first bind() was IP_BINDMULTI.

* Teach the PCBGROUP code about IP_RSS_LISTEN_BUCKET marked PCB entries.
  Instead of using the wildcard logic and hashing, these sockets are
  simply placed into the PCBGROUP and _not_ in the wildcard hash.

* When doing a PCBGROUP lookup, also do a wildcard match.
  This allows for an RSS bucket PCB entry to appear in a PCBGROUP
  rather than having to exist in the wildcard list.

Tested:

* TCP IPv4 server testing with igb(4)
* TCP IPv4 server testing with ix(4)

TODO:

* The pcbgroup lookup code duplicated the wildcard and wildcard-PCB
  logic.  This could be refactored into a single function.

* This doesn't yet work for IPv6 (The PCBGROUP code in netinet6/ doesn't
  yet know about this); nor does it yet fully work for UDP.
2014-07-10 03:10:56 +00:00
Warner Losh
23f02598b8 Now that pc98 no longer needs gcc to compile boot2, remove the special
case and treat it just like i386.
2014-07-10 00:15:55 +00:00
Warner Losh
aa0b5651c1 Compile boot2 with clang on pc98.
2014-07-10 00:15:50 +00:00
Warner Losh
53dda6a8d5 Make SERIAL support optional again. Enable it for i386 because a huge
percentage of machines has a 16550. Disable it for pc98 since only a
tiny fraction of them have one. These changes save 293 bytes when
building with clang, but preserve the ability to build with serial if
you really want.  We now have 92 bytes free (412 with the in-tree gcc).
2014-07-10 00:15:42 +00:00
Warner Losh
522d68a17f Merge the clang support from i386. Don't move to clang yet.
2014-07-10 00:15:38 +00:00
Xin LI
1b174fa1eb MFV r268455:
Use reserved space for ZFS administrative commands.

We reserve 1/2^spa_slop_shift = 1/32 or 3.125% of pool space (or 32MB at
least) for system use.  Most ZPL operations, e.g. write(2), creat(2), will
fail with ENOSPC if we fall below this.

Certain operations, e.g. file removal and most administrative actions,
are still permitted until half of the slop space is used.  This allows
users to use these operations to free up space in the pool when the pool
is close to full but half of the slop space is still free.

A very restricted set of operations that free up space or change quotas
is always permitted, regardless of the amount of free space.

MFC after:	 2 weeks
2014-07-09 23:14:59 +00:00
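
The three tiers described in 1b174fa1eb can be modelled with a few lines of arithmetic. This is a sketch under assumptions: the shift of 5 and the 32MB floor come from the commit message, while the names, the operation classes and the exact boundary comparisons are hypothetical and do not mirror the ZFS source.

```c
#include <stdint.h>

#define EX_SLOP_SHIFT	5			/* 1/2^5 = 1/32 of the pool */
#define EX_MIN_SLOP	(32ULL << 20)		/* 32MB floor */

enum ex_op_class {
	EX_OP_NORMAL,	/* e.g. write(2), creat(2) */
	EX_OP_ADMIN,	/* file removal, most administrative actions */
	EX_OP_ALWAYS	/* restricted set that is always permitted */
};

/* Reserved space: 1/32 of the pool, but never less than 32MB. */
static uint64_t
ex_slop_space(uint64_t pool_size)
{
	uint64_t slop = pool_size >> EX_SLOP_SHIFT;

	return (slop > EX_MIN_SLOP ? slop : EX_MIN_SLOP);
}

/* Returns 1 if an operation of class cls may proceed, 0 for ENOSPC. */
static int
ex_op_permitted(uint64_t pool_size, uint64_t free_space,
    enum ex_op_class cls)
{
	uint64_t slop = ex_slop_space(pool_size);

	switch (cls) {
	case EX_OP_ALWAYS:
		return (1);
	case EX_OP_ADMIN:
		/* Allowed until half of the slop space has been used. */
		return (free_space >= slop / 2);
	case EX_OP_NORMAL:
	default:
		return (free_space >= slop);
	}
}
```

For a 1TiB pool this model reserves 32GiB: normal writes stop once free space drops below that, while administrative operations continue down to 16GiB.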
Aleksandr Rybalko
79b647995d Should check fb_read method presence instead of double check for fb_write.
Pointed by:     emaste

Sponsored by:	The FreeBSD Foundation
2014-07-09 21:55:34 +00:00
Konstantin Belousov
fd815c0b8d For safety, ensure that any consumer of set_regs() and
ptrace_set_pc() uses the correct return to userspace using iret.

The signal return, PT_CONTINUE (which in fact uses signal return path)
set the pcb flag already.  The setcontext(2) enforces iret return when
%rip is incorrect.  Due to this, the change is redundant, but is made
to ensure that no path which modifies context, forgets to set
PCB_FULL_IRET.

Inspired by:	CVE-2014-4699
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-07-09 21:39:40 +00:00
Xin LI
b1396c9f98 MFV r268454:
Refresh zpool list for each interval in order to produce fresh
output.

Illumos issue: 4966 zpool list iterator does not update output

MFC after:	 2 weeks
2014-07-09 21:07:20 +00:00
Xin LI
ad9b19c1e8 MFV r268453:
Diff reduction against Illumos.

MFC after:	 2 weeks
2014-07-09 20:57:42 +00:00
Konstantin Belousov
a028ee5c9f Implement sysconf(_SC_GETGR_R_SIZE_MAX) and sysconf(_SC_GETPW_R_SIZE_MAX).
Reported by:	Dmitry Sivachenko <trtrmitya@gmail.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-07-09 19:12:18 +00:00
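
The sysconf() names implemented in a028ee5c9f are typically used to size the scratch buffers passed to getpwuid_r() and getgrgid_r(). A hedged sketch of that usage follows. The fallback size of 1024 bytes and the helper names are assumptions, chosen only because sysconf() is allowed to return -1 when there is no fixed limit.

```c
#include <stddef.h>
#include <unistd.h>

/* Assumed fallback for when sysconf() reports no definite limit. */
#define EX_FALLBACK_BUFSIZE 1024

/* Suggested initial buffer size for getpwnam_r() and getpwuid_r(). */
static size_t
ex_pw_buffer_size(void)
{
	long n = sysconf(_SC_GETPW_R_SIZE_MAX);

	return (n > 0 ? (size_t)n : EX_FALLBACK_BUFSIZE);
}

/* Suggested initial buffer size for getgrnam_r() and getgrgid_r(). */
static size_t
ex_gr_buffer_size(void)
{
	long n = sysconf(_SC_GETGR_R_SIZE_MAX);

	return (n > 0 ? (size_t)n : EX_FALLBACK_BUFSIZE);
}
```

Callers should still be prepared to grow the buffer and retry if the lookup fails with ERANGE, since the value is a suggested starting size and not a hard maximum.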
Konstantin Belousov
a91831a261 Current code in sysctl proc.vmmap, whose intent is to calculate the
amount of resident pages, in fact calculates the amount of installed
pte entries in the region.  Resident pages which were not soft-faulted
yet are not counted.

Calculate the amount of resident pages by looking in the objects chain
backing the region.

Add a knob to disable the residency calculation entirely.  For large
sparse regions, both the previous and the updated algorithm run for too
long, while several introspection tools do not need the (advisory) RSS
value at all.

PR:	kern/188911
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-07-09 19:11:57 +00:00
Xin LI
fdc0ee2cf5 MFV r268452:
Explicitly mark file removal transactions as "presumed to result
in a net free of space" so they will not fail with ENOSPC.

Illumos issue:	4950 files sometimes can't be removed from a full
		filesystem
MFC after:	2 weeks
2014-07-09 18:32:40 +00:00
Dimitry Andric
3d12a34380 In libproc, avoid calling __cxa_demangle(), and thus depending on either
libcxxrt or libsupc++, if WITHOUT_CXX is defined.

Noticed by:	sbruno
MFC after:	1 week
2014-07-09 17:31:57 +00:00