# Unexpected behavior of the BSD sed command



## richmikan (Jan 28, 2013)

I found some unexpected behavior in the BSD sed command.

Try the following command on your FreeBSD host and you will probably see the following response.

```
$ seq 1 10 | awk '{print $1 ",A"}' | sed '3,4N; s/\n/-/g'"]
1,A
2,A
3,A-4,A
5,A-6,A
7,A-8,A
9,A
10,A
$
```

However, GNU sed, which is available from the textproc/gsed port or
on a Linux host, returns a different response:

```
$ seq 1 10 | awk '{print $1 ",A"}' | gsed '3,4N; s/\n/-/g'"]
1,A
2,A
3,A-4,A
5,A
6,A
7,A
8,A
9,A
10,A
$
```

I don't know why the two versions of sed return different responses. I suspect that GNU sed is the one working correctly, because "3,4N" tells sed to append the next line only for lines #3 through #4, and *NOT* for line #5.

Is something wrong with BSD sed, or is this just my misunderstanding?
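
If the goal is only to join line 3 with line 4, a single-address form avoids the question of when the range closes. This is a minimal sketch, assuming a POSIX-style sed; the function name `join_3_4` is made up for illustration.

```shell
# Join line 3 with line 4 using a single address, so no range has to be closed.
# Each part of the script is passed with its own -e to stay portable.
join_3_4() {
  sed -e '3{' -e 'N' -e 's/\n/-/' -e '}'
}

seq 1 10 | awk '{print $1 ",A"}' | join_3_4
```

Because `N` only runs on line 3, lines 5 and later are never candidates for joining.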


----------



## graudeejs (Jan 28, 2013)

Just to note, you have 2 extra characters at the end of each command line (this is not a problem with sed, just a typo).


----------



## richmikan (Jan 28, 2013)

Oh, sorry! Those (`"]`) are typos.
Thank you for pointing that out; you are right.


But the unexpected behavior does not go away when the typo characters are removed.
Do you know the reason?


----------



## graudeejs (Jan 28, 2013)

Personally, I don't know, but different implementations might behave differently.
I think you should ask on the @stable mailing list.

Currently it looks like a bug, but I'm not that much into sed.


----------



## J65nko (Jan 28, 2013)

On OpenBSD the output is as follows:

```
$ jot 10 1 | awk '{print $1 ",A"}' | sed '3,4N; s/\n/-/g'
1,A
2,A
3,A-4,A
5,A-6,A
7,A-8,A
9,A-10,A
```


----------



## richmikan (Jan 28, 2013)

Thank you for your advice.
You also think it looks like a bug, don't you?

I will ask on the mailing list.
Thanks again.


----------



## richmikan (Jan 28, 2013)

J65nko said:

> On OpenBSD the output is as follows:
> 
> ```
> $ jot 10 1 | awk '{print $1 ",A"}' | sed '3,4N; s/\n/-/g'
> ...
> ```



That confuses me even more!
Hmm, isn't it a complex problem?


----------



## richmikan (Jan 29, 2013)

Thanks to qpatrick.

The sed command on AIX works as expected.


----------



## mvatten (Jan 29, 2013)

Just to add to the data: plan9port's sed gives the same result as the GNU sed here.

Mark.


----------



## _martin (Jan 29, 2013)

Hm, I can't comment on the sed part, but for the sake of comparison here are the results from HP-UX (all 11i versions: 11.11/11.23/11.31):

`# printf "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n" | awk '{print $1 ",A"}' | sed '3,4N; s/\n/-/g'`

```
1,A
2,A
3,A-4,A
5,A
6,A
7,A
8,A
9,A
10,A
```
There's neither seq nor jot on HP-UX, so I had to improvise.


----------



## fonz (Jan 29, 2013)

Just to pitch in: it could be a bug, or it could be a difference in semantics between GNU sed and BSD sed. Does the man page provide any hints as to what is expected BSD sed behaviour in this case?


----------



## richmikan (Jan 30, 2013)

Thank you for reporting, everyone.

I can also report on other implementations of sed.

sed on AIX 6.1.0.0 
sed on HP-UX B.11.23 
sed on SunOS 5.9 (Solaris 9)  
They all return the same response as GNU sed, although these implementations are probably not derived from GNU's.

To matoatlantis:

How about the following command for OSes with neither seq nor jot?
`$ yes A | head -n 10 | awk '{print NR "," $1}' | sed '3,4N; s/\n/-/g'`
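
Another option is to let awk generate the numbered lines itself, so that none of seq, jot, or yes is needed. This is a sketch that assumes an awk supporting the POSIX `-v` option; `gen_lines` is a hypothetical helper name.

```shell
# Print "1,A" .. "N,A" using awk alone (N defaults to 10).
gen_lines() {
  awk -v n="${1:-10}" 'BEGIN { for (i = 1; i <= n; i++) print i ",A" }'
}

gen_lines 10 | sed '3,4N; s/\n/-/g'
```

The output of the final sed stage is whatever the local sed produces; only the input generation is meant to be portable here.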


----------



## throAU (Jan 30, 2013)

Different from GNU is not a bug.

What does the FreeBSD manpage say the FreeBSD behaviour should be?


----------



## graudeejs (Jan 30, 2013)

Isn't sed in Solaris 9 the same as GNU sed? I ask, because I know there is GNU stuff on Solaris (on newer versions). Don't know about other Unixes.


----------



## _martin (Jan 30, 2013)

richmikan said:

> How about the following command for OSes with neither seq nor jot?
> `$ yes A | head -n 10 | awk '{print NR "," $1}' | sed '3,4N; s/\n/-/g'`



`# yes A | head -n 10 | awk '{print NR "," $1}' | sed '3,4N; s/\n/-/g'`

```
1,A
2,A
3,A-4,A
5,A
6,A
7,A
8,A
9,A
10,A
```

This output is from 11.31; the other releases gave the same output.  
I highly doubt that sed on HP-UX is GNU sed, but according to the docs it conforms to the following standards:


```
STANDARDS CONFORMANCE
      sed: SVID2, SVID3, XPG2, XPG3, XPG4, POSIX.2
```

On Solaris 10 you can choose a different sed depending on the standard:

`# printf "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n" | awk '{print $1 ",A"}' | /usr/bin/sed '3,4N; s/\n/-/g'`

```
1,A
2,A
3,A-4,A
5,A
6,A
7,A
8,A
9,A
10,A
```
The output is the same when using /usr/xpg4/bin/sed instead.


----------



## J65nko (Jan 30, 2013)

The gsed man page at http://www.freebsd.org/cgi/man.cgi?....0-RELEASE+and+Ports&arch=default&format=html describes the N command as follows:

```
n N    Read/append the next line of input into the pattern space.
```
The FreeBSD description from sed(1):

```
[2addr]N
	     Append the next line of input to the pattern space, using an
	     embedded newline character to separate the appended material from
	     the original contents.  Note that the current line number
	     changes.
```

The OpenBSD man page also has the note that the current line number changes.
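
The note can be observed directly with the `=` command, which prints the current line number. This is a minimal sketch, assuming a POSIX-style sed; the function name `lineno_after_N` is made up for illustration.

```shell
# On input line 1, run N and then print the line number sed reports.
# -n suppresses the automatic printing of the pattern space.
lineno_after_N() {
  sed -n -e '1{' -e 'N' -e '=' -e '}'
}

printf 'a\nb\nc\n' | lineno_after_N
```

The cycle starts on line 1, but after `N` the reported number is expected to be 2, which is the change the man page refers to.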


----------



## fonz (Jan 30, 2013)

So, apparently what we have here is a documented semantic difference between GNU sed and BSD sed.
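
One way to picture how such a difference could arise is to model the range check with two different closing rules. The awk sketch below is purely an assumption for illustration. It is not taken from the source of any sed, and `model_range_N` and its `exact`/`past` rule names are hypothetical.

```shell
# Hypothetical model of "3,4N; s/\n/-/g". This is NOT code from any sed
# implementation; it only contrasts two assumed rules for closing the range.
#   $1 = exact : the range closes only when a cycle starts exactly on line 4
#   $1 = past  : the range also closes once a cycle starts beyond line 4
model_range_N() {
  awk -v first=3 -v last=4 -v rule="$1" '
    { line[NR] = $0 }                        # read all input first
    END {
      inrange = 0
      for (n = 1; n <= NR; n++) {            # each iteration models one cycle
        ps = line[n]                         # pattern space
        applies = 0
        if (inrange) {
          if (n == last) { inrange = 0; applies = 1 }
          else if (rule == "past" && n > last) { inrange = 0 }
          else { applies = 1 }
        } else if (n == first) {
          applies = 1
          if (n < last) inrange = 1
        }
        if (applies && n < NR) {             # models N followed by s/\n/-/g
          n++                                # N consumes the next input line
          ps = ps "-" line[n]
        }
        print ps
      }
    }'
}

seq 1 10 | awk '{print $1 ",A"}' | model_range_N exact
seq 1 10 | awk '{print $1 ",A"}' | model_range_N past
```

With the `exact` rule the model prints the same lines as the OpenBSD output posted above, because `N` skips over line 4 and no cycle ever starts on it. With the `past` rule it prints the same lines as gsed. Neither rule reproduces the FreeBSD output, where 9,A and 10,A stayed separate, so the real implementations clearly do more than this model captures.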


----------



## mvatten (Jan 30, 2013)

But the man page of plan9port sed states the same, while not giving the same result as FreeBSD sed:


```
`N    Append the next line of input to the pattern
      space with an embedded newline.  (The current
      line number changes.)'
```

Mark.


----------



## richmikan (Jan 31, 2013)

Thanks to everyone, again.

I suppose that, even if the behavior of BSD sed is not a bug but a matter of semantics,
I still can't concretely understand or explain the reason for the behavior.

I don't know why, on FreeBSD sed,
"3,4N" produces "3,A-4,A", "5,A-6,A" and "7,A-8,A",
while "3,5N" produces only "3,A-4,A".


----------



## J65nko (Jan 31, 2013)

sed(1) works on lines, and a line is a sequence of non-linefeed characters followed by a linefeed character ("\n"). The example code that we have been looking at "messes around" with that critical linefeed. We replace it with a "-":


```
3,4 {
N
s/\n/-/
}
```
So we change the marker that defines the chunk of data sed(1) is working with.

In the following attempts I use this text file:

```
$ cat 1-10a.txt
1,A
2,A
3,A
4,A
5,A
6,A
7,A
8,A
9,A
10,A
```

The sed(1) command file:

```
$ cat cmd4.sed
3,4 {
H
}

5 {
x
s/\n/-/g
}
```
Lines 3-4 are transferred from pattern space to *H*old space. At line 5 we swap *H*old space and pattern space, and substitute the newline with the hyphen.


```
$ sed -f cmd4.sed 1-10a.txt
1,A
2,A
3,A
4,A
3,A-4,A
6,A
7,A
8,A
9,A
10,A
```
Now lines 3-4 are still being displayed and line 5 is missing.
An ugly hack is to instruct sed(1) not to echo the lines, using the *-n* option, and to print explicitly with *p*:


```
# cat cmd5.sed
1,2 {
p
}

3,4{
H
}

5 {
x
s/\n/-/g
p
x
p
}

6,10 {
p
}
```
Yes, it is ugly, but it produces the desired output:


```
$ sed -nf cmd5.sed 1-10a.txt
1,A
2,A
3,A-4,A
5,A
6,A
7,A
8,A
9,A
10,A
```


----------



## throAU (Feb 1, 2013)

richmikan said:

> Thanks to everyone, again.
> 
> I suppose that, even if the behavior of BSD sed is not a bug but a matter of semantics,
> I still can't concretely understand or explain the reason for the behavior.



Because GNU wrote gsed afterwards, and changed the behaviour.


----------

