# awk's rand() not really random?



## jrick (Aug 21, 2009)

I'm trying to select a random item from a list with awk, but I'm finding that awk's rand() feature isn't quite working as expected.


```
% awk 'BEGIN{ srand(); print rand(); }'
0.889922
% awk 'BEGIN{ srand(); print rand(); }'
0.88993
% awk 'BEGIN{ srand(); print rand(); }'
0.88993
% awk 'BEGIN{ srand(); print rand(); }'
0.889938
% awk 'BEGIN{ srand(); print rand(); }'
0.889938
% awk 'BEGIN{ srand(); print rand(); }'
0.889946
% awk 'BEGIN{ srand(); print rand(); }'
0.889946
% awk 'BEGIN{ srand(); print rand(); }'
0.889954
% awk 'BEGIN{ srand(); print rand(); }'
0.889962
% awk 'BEGIN{ srand(); print rand(); }'
0.889962
% awk 'BEGIN{ srand(); print rand(); }'
0.889969
% awk 'BEGIN{ srand(); print rand(); }'
0.889969
```

...which isn't really random.

Am I doing something wrong? Is there a different way I should be using this?


----------



## anomie (Aug 21, 2009)

I'm not sure if this will help or not (because the documentation is for - and I am using - gawk), but: 

http://www.gnu.org/software/gawk/manual/html_node/Numeric-Functions.html#Numeric-Functions

In your example, if you're re-running the command at a quick pace I suppose the seed (based on date/time if one is not explicitly provided) could be very similar. 

Here's an example session I just tried, waiting ~4 seconds between each press of Enter: 

```
$ awk '{ srand() ; print rand() }'

0.669459

0.291102

0.614256

0.286518

0.362619

0.485851
^C
```


----------



## anomie (Aug 21, 2009)

Yes, I just confirmed. Look what happens if I hit Enter rapidly: 

```
$ awk '{ srand() ; print rand() }'

0.372023

0.372023

0.372023

0.372023

0.372023

0.602575

0.602575

0.602575
^C
```

So you are probably going to need to either provide a seed or slow things down a bit.
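The underlying behaviour can be checked directly: with a fixed seed, the generator is deterministic, so two separate awk processes print the same first value. A minimal sketch, where the seed 42 and the variable names a and b are arbitrary choices for illustration:

```shell
# Two independent awk processes, both seeded with the same arbitrary value.
a=$(awk 'BEGIN { srand(42); print rand() }')
b=$(awk 'BEGIN { srand(42); print rand() }')

# The two values match, because the sequence depends only on the seed.
echo "$a $b"
```

The same thing happens with a bare srand() whenever two runs end up with the same time-based seed.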


----------



## jrick (Aug 21, 2009)

Well, I have it kind of working with gawk, but I would really prefer to have a solution using awk from base for portability.


```
gawk 'BEGIN { srand(systime() + PROCINFO["pid"]); print rand() }'
```

(taken from here)

Unfortunately, systime() is only provided with gawk. Any good ideas on how to do this with /usr/bin/awk?


----------



## anomie (Aug 21, 2009)

This time with awk:

```
$ awk --version
awk version 20070501 (FreeBSD)
```

OK for a single run: 

```
$ _seed=`date +%s` ; awk '{ srand('${_seed}') ; print rand()}'
```

Each time that entire command is invoked, you'll have Epoch seconds assigned to _seed.
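An alternative sketch that hands the seed to awk with -v instead of splicing it into the program text, and uses a BEGIN block so awk does not wait for input. The variable names seed and r are illustrative only, and date +%s is assumed to be available as in the command above:

```shell
# Epoch seconds from the shell, used as the seed.
seed=$(date +%s)

# -v assigns the shell value to the awk variable "seed" before BEGIN runs.
r=$(awk -v seed="$seed" 'BEGIN { srand(seed); print rand() }')
echo "$r"
```

Two runs within the same second still get the same seed, so this only helps when the runs are spaced out.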


----------



## jrick (Aug 21, 2009)

Awesome, thanks. But now it seems like awk doesn't respect PROCINFO like gawk does.


```
% gawk 'BEGIN{ print PROCINFO["pid"]}'
10580
% awk 'BEGIN{ print PROCINFO["pid"]}'
```

According to the gawk documentation, PROCINFO["pid"] returns the process ID of the current process.  Any idea how to get something like this working?


----------



## anomie (Aug 22, 2009)

Dunno. All I can think of is to use the Bourne shell's $$ variable to get the shell's pid (assuming you're going to be running this from a Bourne shell or from a Bourne shell script). 

```
$ awk '{ print '$$' }'

228
```
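Note that $$ is the pid of the shell, not of awk, so it stays the same for every awk started from that shell. A rough sketch of one way to get a value that usually changes per invocation, by asking a short-lived child shell for its own pid; the names shell_pid, child_pid, seed and r are made up for this example:

```shell
# PID of the current shell: identical for every command run from it.
shell_pid=$$

# A child sh reports its own PID, which normally differs on each call.
child_pid=$(sh -c 'echo $$')

# Mix the child PID into a time-based seed and pass it to awk.
seed=$(( $(date +%s) + child_pid ))
r=$(awk -v seed="$seed" 'BEGIN { srand(seed); print rand() }')
echo "shell=$shell_pid child=$child_pid rand=$r"
```

This is only meant to spread out seeds between quick successive runs, not to produce high-quality randomness.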


----------



## Alt (Aug 24, 2009)

I tried the following construct and it seems to work OK:


> awk 'BEGIN { srand(); } { print rand() }'


----------



## jrick (Aug 24, 2009)

Alt said:

> I tried the following construct and it seems to work OK



If you run it just once, yes.  If you run it multiple times in a row, the "random" numbers aren't really all that random.  This is because the seed that srand() uses is the system time.

I also found out that this whole thread seems kind of pointless, since srand() in base awk apparently takes no arguments, unlike the srand() of gawk. Even though I can pass it the same values as with gawk, it doesn't seem to make any difference.


----------



## Alt (Aug 24, 2009)

Then that's much harder, huh. 
The next version is:


> awk -v rn=`jot -r 1 1 10000000` "BEGIN { srand(srand()+rn); } { print rand(); }"



Tested with


> echo "Test" | awk -v rn=`jot -r 1 1 10000000` "BEGIN { srand(srand()+rn); } { print rand(); }" ; echo "Test" | awk -v rn=`jot -r 1 1 10000000` "BEGIN { srand(srand()+rn ); } {print rand(); }"


----------



## jrick (Aug 24, 2009)

Wow, that works much nicer.  How exactly does that work, and how are you able to give arguments to srand()?


----------



## Alt (Aug 24, 2009)

```
awk -v rn=`jot -r 1 1 10000000` "BEGIN { srand(srand()+rn); } { print rand(); }"
```

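jot -r 1 1 10000000 gives a random number from 1 to 10000000 =)

-v rn=... passes that value into the awk interpreter as the variable rn.

srand() takes a seed number and returns the old seed. Note that the first srand() call returns awk's initial default seed (a fixed value such as 0 in many implementations), not the system time, so the outer srand() is effectively seeded with rn plus that default; the variation comes from jot.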

UPD: it can be simplified to 

```
awk "BEGIN { srand(srand()+`jot -r 1 1 10000000`); } {print rand(); }"
```
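The "returns the old seed" behaviour of srand() can be seen in isolation with a small experiment. This is only a sketch: the seed 5 is arbitrary and prev is a made-up variable name.

```shell
# srand(5) sets the seed to 5; the following srand() reseeds from the
# clock and returns the seed that was in effect just before it, which
# is what gets printed.
prev=$(awk 'BEGIN { srand(5); print srand() }')
echo "$prev"
```

So srand() used as an expression yields whatever seed was set before the call, not the one it has just installed.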


----------



## ephemera (Aug 24, 2009)

I think the problem is that srand() is being called for every call to rand(). When srand(3) is initialized with the same value, the pseudo-random number sequence is repeated (awk's srand() is probably implemented as srand(time(0))).

Try: awk 'BEGIN {srand()} {print rand()}'
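Applied to the original goal of picking a random item from a list, a minimal sketch could look like this. The three-item list and the names item and pick are placeholders for illustration:

```shell
# Seed once in BEGIN, collect every input line, then choose one index
# in the range 1..NR after all input has been read.
pick=$(printf '%s\n' apple banana cherry |
  awk 'BEGIN { srand() }
       { item[NR] = $0 }
       END { if (NR) print item[int(rand() * NR) + 1] }')
echo "$pick"
```

Since rand() returns a value in [0, 1), int(rand() * NR) + 1 always lands on a valid line number. Runs started within the same second will still pick the same item unless an explicit seed is supplied.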


----------



## bigearsbilly (Aug 28, 2009)

Aye, you should call srand() once at the beginning, not every time.


----------

