'y' does not handle bracket expressions, treat '[' as ordinary character
and do not apply bracket expression checks (GNU sed agrees).
PR: 247931
Reviewed by: pfg, kevans
Tested by: antoine (exp-run), Quentin L'Hours <lhoursquentin@gmail.com>
Differential Revision: https://reviews.freebsd.org/D25640
Somewhat predictably, software often wants to use \x27/\x24 among others so
that they can decline worrying about ugly escaping, if said escaping is even
possible. Right now, this software is using these and getting the wrong
results, as we'll interpret those as x27 and x24 respectively. Some examples
of this, when an exp-run was ran, were science/octopus and misc/vifm.
Go ahead and process these at all times. We allow either one or two digits,
and the tests account for both. If extra digits are specified, e.g. \x2727,
then the third and fourth digits are interpreted literally as one might
expect.
PR: 229925
MFC after: 2 weeks
This is both reasonable and a common GNUism that a lot of ported software
expects.
Universally process \r, \n, and \t into carriage return, newline, and tab
respectively. Newline still doesn't function in contexts where it can't
(e.g. BRE), but we process it anyways rather than passing
UB \n (escaped ordinary) through to the underlying regex engine.
Adding a --posix flag to disable these was considered, but sed.1 already
declares this version of sed a super-set of POSIX specification and this
behavior is the most likely expected when one attempts to use one of these
escape sequences in pattern space.
This differs from pre-r197362 behavior in that we now honor the three
arguably most common escape sequences used with sed(1) and we do so outside
of character classes, too.
Other escape sequences, like \s and \S, will come later when GNU extensions
are added to libregex; sed will likely link against libregex by default,
since the GNU extensions tend to be fairly un-intrusive.
PR: 229925
Reviewed by: bapt, emaste, pfg
Differential Revision: https://reviews.freebsd.org/D22750
The fix is only partial and causes an asymmetry which breaks a test in
multi_test.sh.
We should consider both parts of the issue found in OpenBSD[1], but for now
just revert the change.
[1] http://undeadly.org/cgi?action=article;sid=20180728110010
Reported by: asomers
We don't generally support the weird case of regular expresions delimited
by an opening square bracket ('[') but POSIX says that inside
bracket expressions, escaping is not possible and both '[' and '\'
represent themselves.
PR: 230198 (exp-run)
Obtained from: OpenBSD
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
In r297602, which included a __FreeBSD_version bump to 1100105, we changed
sed 'i' and 'a' from discarding whitespaces to conform with what GNU and
sysvish sed do.
There are arguments in favor of keeping the old behavior but the new
behavior is also useful for migration purposes. It seems important to at
least consider the case of developers depending on the previous behavior,
so add a CFLAG to enable the old behaviour.
PR: 213474
MFC after: 5 days
While big, the change was meant to have no effect on behavior and instead
so far we have found two regressions: one in the etcupdate tests and
another one in the games/openttd port[1].
Revert to a known working state. We will likely have to split the patch in
functional parts before bringing back the changes.
PR: 195929
Reported by: danfe, madpilot [1]
This appears to be implementation dependent but convenient and makes
our sed behave more like GNU sed.
Given that it is not the historic behavior, bump FreeBSD_version
should userland/ports somehow depend on it.
Obtained from: NetBSD (bin/49872)
Reviewed by: bdrewery
PR: 208554
Merge after: NEVER
"The escape sequence '\n' shall match a <newline> embedded in
the pattern space."
It is unclear whether this also applies to a \n embedded in a
character class. Disable the existing handling of \n in a character
class following Mac OS X, GNU sed version 4.1.5 with --posix, and
SunOS 5.10 /usr/bin/sed.
Pointed by: Marius Strobl
Obtained from: Mac OS X
of the y (translate) command.
"If a backslash character is immediately followed by a backslash
character in string1 or string2, the two backslash characters shall
be counted as a single literal backslash character"
Pointed by: Marius Strobl
Obtained from: Mac OS X
specification and regression test regress:25.
"A function can be preceded by one or more '!' characters, in which
case the function shall be applied if the addresses do not select
the pattern space."
MFC after: 2 weeks
parenthesized subexpression is defined. For example, the
following command line caused unexpected behavior like
segmentation fault:
% echo test | sed -e 's/test/\1/'
PR: bin/126682
MFC after: 1 week
1) Add missing parens around assignment that is compared to zero.
2) Make some variables that only take non-negative values unsigned.
3) Some casts/type changes to fix other constness warnings.
4) Make one variable a const char *.
5) Make sure termwidth is positive, it doesn't make sense for it to be negative.
Approved by: dds
whether we should ignore case, determine the flag by calling
compile_flags() first. Also, make sure that we obtain an
initialized cmd->u.s buffer before processing further. We
may want to refine this solution later, but for now, make
the changes in order to unbreak world build after a sed(1)
with rev. 1.29 of compile.c is installed.
Approved by: re (hrs)
was initiated at the last character of the line buffer, the Wrong
Thing was done and sed barfed by interpreting the following NUL byte
as a digit. Instead, pull up the next buffer and record that the "\"
was last seen.
substitution expressions in the form `s,[fooexp],[barexp],;...' treated
as invalid when the third `,' is (_POSIX2_LINE_MAX * N)-th character in
the line.
MFC after: 2 weeks
an ``a'' command that has an escaped newline on the
last line of the last script that we're processing.
This fixes exmh2/scripts/build when /etc/malloc.conf -> AJ
The fundamental problem with the original code is that it accesses
p[-2] which is one before the beginning of the input buffer for
empty lines. rev.1.6 just moved the problem from failures when
p[-2] happens to be '\\' to failures when it happens to be '\0'.
rev.1.5 was confused about the trailing newline and other things.
I went back to rev.1.5 and fixed it. The result is the same as
Keith Bostic's final version in PR 1356 except it loses more
gracefully for excessively long input lines.