and receives frames on any port of the lagg(4).
Phabric: D549
Reviewed by: glebius, thompsa
Approved by: glebius
Obtained from: OpenBSD
Sponsored by: QNAP Systems Inc.
devq_openings counter lost its meaning after allocation queues has gone.
held counter is still meaningful, but problematic to update due to separate
locking of CCB allocation and queuing.
To fix that replace devq_openings counter with allocated counter. held is
now calculated on request as difference between number of allocated, queued
and active CCBs.
MFC after: 1 month
It affects the IPv6 source address selection algorithm (RFC 6724)
and allows override the last rule ("longest matching prefix") for
choosing among equivalent addresses. The address with `prefer_source'
will be preferred source address.
Obtained from: Yandex LLC
MFC after: 1 month
Sponsored by: Yandex LLC
Kernel changes:
* Split kernel/userland nat structures eliminating IPFW_INTERNAL hack.
* Add IP_FW_NAT44_* codes resemblin old ones.
* Assume that instances can be named (no kernel support currently).
* Use both UH+WLOCK locks for all configuration changes.
* Provide full ABI support for old sockopts.
Userland changes:
* Use IP_FW_NAT44_* codes for nat operations.
* Remove undocumented ability to show ranges of nat "log" entries.
on "ifconfig -v". I've seen no measurable timing difference
for doing additional SIOCGI2C call for system with 4k vlans.
* Determine appropriate handler (SFP/QSFP) by reading identification
byte (which is the same for both SFF-8472 and SFF-8436) instead
of checking driver name.
MFC with: r270064
Sponsored by: Yandex LLC
This is the last major change in given branch.
Kernel changes:
* Use 64-bytes structures to hold multi-value variables.
* Use shared array to hold values from all tables (assume
each table algo is capable of holding 32-byte variables).
* Add some placeholders to support per-table value arrays in future.
* Use simple eventhandler-style API to ease the process of adding new
table items. Currently table addition may required multiple UH drops/
acquires which is quite tricky due to atomic table modificatio/swap
support, shared array resize, etc. Deal with it by calling special
notifier capable of rolling back state before actually performing
swap/resize operations. Original operation then restarts itself after
acquiring UH lock.
* Bump all objhash users default values to at least 64
* Fix custom hashing inside objhash.
Userland changes:
* Add support for dumping shared value array via "vlist" internal cmd.
* Some small print/fill_flags dixes to support u32 values.
* valtype is now bitmask of
<skipto|pipe|fib|nat|dscp|tag|divert|netgraph|limit|ipv4|ipv6>.
New values can hold distinct values for each of this types.
* Provide special "legacy" type which assumes all values are the same.
* More helpers/docs following..
Some examples:
3:41 [1] zfscurr0# ipfw table mimimi create valtype skipto,limit,ipv4,ipv6
3:41 [1] zfscurr0# ipfw table mimimi info
+++ table(mimimi), set(0) +++
kindex: 2, type: addr
references: 0, valtype: skipto,limit,ipv4,ipv6
algorithm: addr:radix
items: 0, size: 296
3:42 [1] zfscurr0# ipfw table mimimi add 10.0.0.5 3000,10,10.0.0.1,2a02:978:2::1
added: 10.0.0.5/32 3000,10,10.0.0.1,2a02:978:2::1
3:42 [1] zfscurr0# ipfw table mimimi list
+++ table(mimimi), set(0) +++
10.0.0.5/32 3000,0,10.0.0.1,2a02:978:2::1
optional attributes field.
- Add a 'machdep.smap' sysctl that exports the SMAP table of the running
system as an array of the ACPI 3.0 structure. (On older systems, the
attributes are given a value of zero.) Note that the sysctl only
exports the SMAP table if it is available via the metadata passed from
the loader to the kernel. If an SMAP is not available, an empty array
is returned.
- Add a format handler for the ACPI 3.0 SMAP structure to the sysctl(8)
binary to format the SMAP structures in a readable format similar to
the format found in boot messages.
MFC after: 2 weeks
* Convert ixgbe to use this ioctl
* Convert ifconfig to use generic i2c handler for "ix" interfaces.
Approved by: Eric Joyner (ixgbe part)
MFC after: 2 weeks
Sponsored by: Yandex LLC
The executable itself doesn't contain any privileged information.
An example of where this is useful is when makefs(8) is creating an image
that includes /sbin/shutdown. This can now be done without root privileges.
Reviewed by: delphij
Discussed with: delphij, des
CR: https://reviews.freebsd.org/D662
QSFP+ data via i2c inteface. These constants has been taken
from SFF-8436 "QSFP+ 10 Gbs 4X PLUGGABLE TRANSCEIVER" standard
rev 4.8.
* Add support for printing QSFP+ information from 40G NICs
such as Chelsio T5.
This commit does not contain ioctl changes necessary for this
functionality work, there will be another commit soon.
Example:
cxl1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=ec07bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,.....>
ether 00:07:43:28:ad:08
nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
media: Ethernet 40Gbase-LR4 <full-duplex>
status: active
plugged: QSFP+ 40GBASE-LR4 (MPO Parallel Optic)
vendor: OEM PN: OP-QSFP-40G-LR4 SN: 20140318001 DATE: 2014-03-18
module temperature: 64.06 C voltage: 3.26 Volts
lane 1: RX: 0.47 mW (-3.21 dBm) TX: 2.78 mW (4.46 dBm)
lane 2: RX: 0.20 mW (-6.94 dBm) TX: 2.80 mW (4.47 dBm)
lane 3: RX: 0.18 mW (-7.38 dBm) TX: 2.79 mW (4.47 dBm)
lane 4: RX: 0.90 mW (-0.45 dBm) TX: 2.80 mW (4.48 dBm)
Tested on: Chelsio T5
Tested on: Mellanox/Huawei passive/active cables/transceivers.
MFC after: 2 weeks
Sponsored by: Yandex LLC
sbin/devd/tests/client_test.c
* In the event that popen fails, don't dereference its return value.
* Fix array overwrite in the stream and seqpacket tests.
* Close sockets at the end of successful ATF tests.
Reported by: Coverity scan
CID: 1232019, 1232020, 1232029, 1232030
MFC after: 1 week
Sponsored by: Spectra Logic
1. 50+% of NO_PIE use is fixed by adding -fPIC to INTERNALLIB and other
build-only utility libraries.
2. Another 40% is fixed by generating _pic.a variants of various libraries.
3. Some of the NO_PIE use is a bit absurd as it is disabling PIE (and ASLR)
where it never would work anyhow, such as csu or loader. This suggests
there may be better ways of adding support to the tree. Many of these
cases can be fixed such that -fPIE will work but there is really no
reason to have it in those cases.
4. Some of the uses are working around hacks done to some Makefiles that are
really building libraries but have been using bsd.prog.mk because the code
is cleaner. Had they been using bsd.lib.mk then NO_PIE would not have
been needed.
We likely do want to enable PIE by default (opt-out) for non-tree consumers
(such as ports). For in-tree though we probably want to only enable PIE
(opt-in) for common attack targets such as remote service daemons and setuid
utilities. This is also a great performance compromise since ASLR is expected
to reduce performance. As such it does not make sense to enable it in all
utilities such as ls(1) that have little benefit to having it enabled.
Reported by: kib
UNIX systems, eg. MacOS X and Solaris. It uses Sun-compatible map format,
has proper kernel support, and LDAP integration.
There are still a few outstanding problems; they will be fixed shortly.
Reviewed by: allanjude@, emaste@, kib@, wblock@ (earlier versions)
Phabric: D523
MFC after: 2 weeks
Relnotes: yes
Sponsored by: The FreeBSD Foundation
presenting most interesting fields via ifconfig -v.
This version supports Intel ixgbe driver only.
Tested on: Cisco,Intel,Mellanox,ModuleTech,Molex transceivers
MFC after: 2 weeks
mount_nfs effectively uses mount protocol v3 by default already.
v1 mount protocol is being removed along with nfsv2 by a high profile NFS
appliance vendor and our legacy v1 mount protocol usage causes rpc errors.
Makefile.inc1:
Always compile gensnmptree with bootstrap-tools when MK_BSNMP != no
instead of depending on a potentially stale tool installed on the build host
sbin/atm/atmconfig/Makefile:
- Always remove oid.h to avoid cluttering up the build/src tree.
- Consolidate all of the RESCUE/MK_BSNMP != no logic under one
conditional to improve readability
- Remove unnecessary ${.OBJDIR} prefixing for oid.h and use ${.TARGET} instead
of spelling out oid.h
- Add a missing DPADD for ${LIBCRYPTO} when compiled MK_BSNMP == yes and
MK_OPENSSL == yes and not compiling for /rescue/rescue
sbin/atm/atmconfig/main.c:
Change #ifndef RESCUE to #ifdef WITH_BSNMP in main.c to make it
clear that we're compiling bsnmp support into atmconfig
Approved by: jmmv (mentor)
Phabric: D579
PR: 143830
MFC after: 2 weeks
Sponsored by: EMC / Isilon Storage Division
This change consists of two merges from projects/zfsd/head along with the
addition of an ATF test case for the new functionality.
sbin/devd/tests/Makefile
sbin/devd/tests/client_test.c
Add ATF test cases for reading events from both devd socket types.
r266519:
sbin/devd/devd.8
sbin/devd/devd.cc
Create a new socket, of type SOCK_SEQPACKET, for communicating with
clients. SOCK_SEQPACKET sockets preserve record boundaries,
simplying code in the client. The old SOCK_STREAM socket is retained
for backwards-compatibility with existing clients.
r269993:
sbin/devd/devd.8
Fix grammar bug.
CR: https://reviews.freebsd.org/rS266519
MFC after: 5 days
Sponsored by: Spectra Logic
Microsoft recommends avoiding the use of spaces in the
string structures for FAT. Unfortunately they do just
that by default in the case of unlabeled filesystems.
Follow the default MS behavior to avoid confusion in
common tools like file(1). This was actually the
default behavior before r203868.
Obtained from: NetBSD (CVS rev. 1.39)
MFC after: 3 days
Most of the tablearg-supported opcodes does not accept 0 as valid value:
O_TAG, O_TAGGED, O_PIPE, O_QUEUE, O_DIVERT, O_TEE, O_SKIPTO, O_CALLRET,
O_NETGRAPH, O_NGTEE, O_NAT treats 0 as invalid input.
The rest are O_SETDSCP and O_SETFIB.
'Fix' them by adding high-order bit (0x8000) set for non-tablearg values.
Do translation in kernel for old clients (import_rule0 / export_rule0),
teach current ipfw(8) binary to add/remove given bit.
This change does not affect handling SETDSCP values, but limit
O_SETFIB values to 32767 instead of 65k. Since currently we have either
old (16) or new (2^32) max fibs, this should not be a big deal:
we're definitely OK for former and have to add another opcode to deal
with latter, regardless of tablearg value.
* Since there seems to be lack of consensus on strict value typing,
remove non-default value types. Use userland-only "value format type"
to print values.
Kernel changes:
* Add IP_FW_XMODIFY to permit table run-time modifications.
Currently we support changing limit and value format type.
Userland changes:
* Support IP_FW_XMODIFY opcode.
* Support specifying value format type (ftype) in tablble create/modify req
* Fine-print value type/value format type.
* Implement proper checks for switching between global and set-aware tables
* Split IP_FW_DEL mess into the following opcodes:
* IP_FW_XDEL (del rules matching pattern)
* IP_FW_XMOVE (move rules matching pattern to another set)
* IP_FW_SET_SWAP (swap between 2 sets)
* IP_FW_SET_MOVE (move one set to another one)
* IP_FW_SET_ENABLE (enable/disable sets)
* Add IP_FW_XZERO / IP_FW_XRESETLOG to finish IP_FW3 migration.
* Use unified ipfw_range_tlv as range description for all of the above.
* Check dynamic states IFF there was non-zero number of deleted dyn rules,
* Del relevant dynamic states with singe traversal instead of per-rule one.
Userland changes:
* Switch ipfw(8) to use new opcodes.
Our mount_nfs does use -o nfsv<2|3|4> or -2 or -3 to specify the version.
OSX (these days), Solaris, and Linux use -o vers=<2,3,4>.
With the upcoming autofs support we can make a lot of (entrerprisy) setups
getting mount options from LDAP just work by providing -o vers= compatibility.
PR: 192379
Reviewed by: wblock, bjk (man page), rmacklem, emaste
MFC after: 3 days
Sponsored by: DARPA,AFRL
Kernel changes:
* Add opcode IP_FW_TABLE_XSWAP
* Add support for swapping 2 tables with the same type/ftype/vtype.
* Make skipto cache init after ipfw locks init.
Userland changes:
* Add "table X swap Y" command.
Kernel changes:
* Add TEI_FLAGS_DONTADD entry flag to indicate that insert is not possible
* Support given flag in all algorithms
* Add "limit" field to ipfw_xtable_info
* Add actual limiting code into add_table_entry()
Userland changes:
* Add "limit" option as "create" table sub-option. Limit modification
is currently impossible.
* Print human-readable errors in table enry addition/deletion code.
* Add "flow:hash" algorithm
Kernel changes:
* Add O_IP_FLOW_LOOKUP opcode to support "flow" lookups
* Add IPFW_TABLE_FLOW table type
* Add "struct tflow_entry" as strage for 6-tuple flows
* Add "flow:hash" algorithm. Basically it is auto-growing chained hash table.
Additionally, we store mask of fields we need to compare in each instance/
* Increase ipfw_obj_tentry size by adding struct tflow_entry
* Add per-algorithm stat (ifpw_ta_tinfo) to ipfw_xtable_info
* Increase algoname length: 32 -> 64 (algo options passed there as string)
* Assume every table type can be customized by flags, use u8 to store "tflags" field.
* Simplify ipfw_find_table_entry() by providing @tentry directly to algo callback.
* Fix bug in cidr:chash resize procedure.
Userland changes:
* add "flow table(NAME)" syntax to support n-tuple checking tables.
* make fill_flags() separate function to ease working with _s_x arrays
* change "table info" output to reflect longer "type" fields
Syntax:
ipfw table fl2 create type flow:[src-ip][,proto][,src-port][,dst-ip][dst-port] [algo flow:hash]
Examples:
0:02 [2] zfscurr0# ipfw table fl2 create type flow:src-ip,proto,dst-port algo flow:hash
0:02 [2] zfscurr0# ipfw table fl2 info
+++ table(fl2), set(0) +++
kindex: 0, type: flow:src-ip,proto,dst-port
valtype: number, references: 0
algorithm: flow:hash
items: 0, size: 280
0:02 [2] zfscurr0# ipfw table fl2 add 2a02:6b8::333,tcp,443 45000
0:02 [2] zfscurr0# ipfw table fl2 add 10.0.0.92,tcp,80 22000
0:02 [2] zfscurr0# ipfw table fl2 list
+++ table(fl2), set(0) +++
2a02:6b8::333,6,443 45000
10.0.0.92,6,80 22000
0:02 [2] zfscurr0# ipfw add 200 count tcp from me to 78.46.89.105 80 flow 'table(fl2)'
00200 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
0:03 [2] zfscurr0# ipfw show
00200 0 0 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
65535 617 59416 allow ip from any to any
0:03 [2] zfscurr0# telnet -s 10.0.0.92 78.46.89.105 80
Trying 78.46.89.105...
..
0:04 [2] zfscurr0# ipfw show
00200 5 272 count tcp from me to 78.46.89.105 dst-port 80 flow table(fl2)
65535 682 66733 allow ip from any to any
Kernel changes:
* s/IPFW_TABLE_U32/IPFW_TABLE_NUMBER/
* Force "lookup <port|uid|gid|jid>" to be IPFW_TABLE_NUMBER
* Support "lookup" method for number tables
* Add number:array algorihm (i32 as key, auto-growing).
Userland changes:
* Support named tables in "lookup <tag> Table"
* Fix handling of "table(NAME,val)" case
* Support printing "number" table data.
restore was failing because ZFS was reporting a blocksize that was
not a multiple of 1024. Replace restore's failed assertion with
code that writes restored files in a blocksize that works for
restore (a multiple of 1024) despite being non-optimal for ZFS.
Submitted by: Dmitry Morozovsky
Tested by: Dmitry Morozovsky
MFC after: 1 week
* Rewrite interface tables to use interface indexes
Kernel changes:
* Add generic interface tracking API:
- ipfw_iface_ref (must call unlocked, performs lazy init if needed, allocates
state & bumps ref)
- ipfw_iface_add_ntfy(UH_WLOCK+WLOCK, links comsumer & runs its callback to
update ifindex)
- ipfw_iface_del_ntfy(UH_WLOCK+WLOCK, unlinks consumer)
- ipfw_iface_unref(unlocked, drops reference)
Additionally, consumer callbacks are called in interface withdrawal/departure.
* Rewrite interface tables to use iface tracking API. Currently tables are
implemented the following way:
runtime data is stored as sorted array of {ifidx, val} for existing interfaces
full data is stored inside namedobj instance (chained hashed table).
* Add IP_FW_XIFLIST opcode to dump status of tracked interfaces
* Pass @chain ptr to most non-locked algorithm callbacks:
(prepare_add, prepare_del, flush_entry ..). This may be needed for better
interaction of given algorithm an other ipfw subsystems
* Add optional "change_ti" algorithm handler to permit updating of
cached table_info pointer (happens in case of table_max resize)
* Fix small bug in ipfw_list_tables()
* Add badd (insert into sorted array) and bdel (remove from sorted array) funcs
Userland changes:
* Add "iflist" cmd to print status of currently tracked interface
* Add stringnum_cmp for better interface/table names sorting
ping6(8) would quit before the remote side gets a chance to respond.
Solve this by resetting the itimer when we have reached the maximum packet
number have reached, but let the other handling to continue.
PR: bin/151023
Submitted by: tjmao at tjmao.net
MFC after: 2 weeks
* Add resize callbacks for upcoming table-based algorithms.
Kernel changes:
* s/ipfw_modify_table/ipfw_manage_table_ent/
* Simplify add_table_entry(): make table creation a separate piece of code.
Do not perform creation if not in "compat" mode.
* Add ability to perform modification of algorithm state (like table resize).
The following callbacks were added:
- prepare_mod (allocate new state, without locks)
- fill_mod (UH_WLOCK, copy old state to new one)
- modify (UH_WLOCK + WLOCK, switch state)
- flush_mod (no locks, flushes allocated data)
Given callbacks are called if table modification has been requested by add or
delete callbacks. Additional u64 tc->'flags' field was added to pass these
requests.
* Change add/del table ent format: permit adding/removing multiple entries
at once (only 1 supported at the moment).
Userland changes:
* Auto-create tables with warning
variants. This allows usable file system images (i.e. those with both a
shell and an editor) to be created with only one copy of the curses library.
Exp-run: antoine
PR: 189842
Discussed with: bapt
Sponsored by: DARPA, AFRL
The free space value in the FSInfo block is merely unitialized when it is
0xffffffff. This fixes a bug found in NetBSD.
It must be noted that we never supported all the checks that NetBSD does
as some of them would cause failures with a freshly created FAT32
from MS-Windows.
While here, bring some space fixes.
Obtained from: NetBSD (rev. 1.22)
MFC after: 3 days
* Switch kernel to use per-cpu counters for rules.
* Keep ABI/API.
Kernel changes:
* Each rules is now exported as TLV with optional extenable
counter block (ip_fW_bcounter for base one) and
ip_fw_rule for rule&cmd data.
* Counters needs to be explicitly requested by IPFW_CFG_GET_COUNTERS flag.
* Separate counters from rules in kernel and clean up ip_fw a bit.
* Pack each rule in IPFW_TLV_RULE_ENT tlv to ease parsing.
* Introduce versioning in container TLV (may be needed in future).
* Fix ipfw_cfg_lheader broken u64 alignment.
Userland changes:
* Use set_mask from cfg header when requesting config
* Fix incorrect read accouting in ipfw_show_config()
* Use IPFW_RULE_NOOPT flag instead of playing with _pad
* Fix "ipfw -d list": do not print counters for dynamic states
* Some small fixes
This includes:
o All directories named *ia64*
o All files named *ia64*
o All ia64-specific code guarded by __ia64__
o All ia64-specific makefile logic
o Mention of ia64 in comments and documentation
This excludes:
o Everything under contrib/
o Everything under crypto/
o sys/xen/interface
o sys/sys/elf_common.h
Discussed at: BSDcan
Kernel changes:
* Change dump format for dynamic states:
each state is now stored inside ipfw_obj_dyntlv
last dynamic state is indicated by IPFW_DF_LAST flag
* Do not perform sooptcopyout() for !SOPT_GET requests.
Userland changes:
* Introduce foreach_state() function handler to ease work
with different states passed by ipfw_dump_config().
* Bump table dump format preserving old ABI.
Kernel size:
* Add IP_FW_TABLE_XFIND to handle "lookup" request from userland.
* Add ta_find_tentry() algorithm callbacks/handlers to support lookups.
* Fully switch to ipfw_obj_tentry for various table dumps:
algorithms are now required to support the latest (ipfw_obj_tentry) entry
dump format, the rest is handled by generic dump code.
IP_FW_TABLE_XLIST opcode version bumped (0 -> 1).
* Eliminate legacy ta_dump_entry algo handler:
dump_table_entry() converts data from current to legacy format.
Userland side:
* Add "lookup" table parameter.
* Change the way table type is guessed: call table_get_info() first,
and check value for IPv4/IPv6 type IFF table does not exist.
* Fix table_get_list(): do more tries if supplied buffer is not enough.
* Sparate table_show_entry() from table_show_list().