# sed(1) RE error: brackets ([ ]) not balanced



## pdefriesse (Jan 2, 2022)

I usually find I missed something so figured I'd check with y'all before filing a bug report on this.  I used sed(1) to look for unprintable chars in text files on FreeBSD version 11 using the following:

`sed -n '/[^\x00-\x7F]/p' test.txt`

which worked fine.  File test.txt is a random block of text to which I added the characters `\x80\x92` with vi.  GNU sed works as expected and finds the "bad" line.  I also tried re-compiling sed(1) from the source tree and got the same "error".

Then I installed 12.2 (`FreeBSD XXXXX 12.2-RELEASE FreeBSD 12.2-RELEASE r366954 GENERIC  i386`) and sed(1) started failing thus:

```
sed: 1: "/[^\x00-\x7F]/p": RE error: brackets ([ ]) not balanced
```
Release notes didn't say anything about regex in sed(1).  

Did I miss a compliance modification regarding hex constants and/or regex?

test.txt attached


----------



## eternal_noob (Jan 2, 2022)

pdefriesse said:


> Did I miss a compliance modification regarding hex constants and/or
> regex?


The regex looks fine to me. I don't know what's wrong.


----------



## covacat (Jan 2, 2022)

the \x00 causes the problem
\x01 works
\0 probably causes a string termination internally and fscks things up


----------



## pdefriesse (Jan 6, 2022)

covacat said:


> the \x00 causes the problem
> \x01 works
> \0 probably causes a string termination internally and fscks things up


Very interesting.  The question is whether that is a bug or a "feature", and whether I should report it.  It looks like `\x00` is being passed as a literal [NUL].

My solution?  Replace sed with GNU sed because IMHO sed is broken.


----------



## covacat (Jan 6, 2022)

it's most likely a bug


----------



## eurohick2 (Jan 19, 2022)

> -E      Interpret regular expressions as extended (modern) regular expressions rather than basic regular expressions (BREs).

you shouldn't expect BSD 'sed' to behave like the one on linux.  read the BSD manpage.  it looks to me like \x is "a context address" in the context you have it in.


----------



## eurohick2 (Jan 19, 2022)

```c
/*
 * Read a file (or stdin) and write it to stdout, replacing every byte
 * whose 8th bit is set with a space.
 */

#include <stdio.h>
#include <string.h>

int main (int argc, char **argv)
{
  FILE *file = stdin;  /* default input when no file argument is given */
  int x;

  char help[] = "USAGE: 7bitclean [-h | -v | file] < file";

  if( argc == 2 && ( strcmp(argv[1], "-h")==0 || strcmp(argv[1], "-v")==0 ) )
  {
    puts(help);
    return 0;
  }

  if( argc == 2 )
  {
    file = fopen(argv[1], "r");
    if( file == NULL ) return 1;
  }

  /* fgetc returns an int so that EOF can be told apart from byte 0xFF */
  while( (x = fgetc(file)) != EOF )
  {
    if( x >= 128 ) x = ' ';
    putchar(x);
  }

  if( file != stdin ) fclose(file);
  return 0;
}
```


----------

