This post is inspired by another blog post called “How do I modify a variable in Haskell?” by Michael Burge.

It got me thinking about different methods of printing Identity Matrices in Bash (because I am not smart enough to do it in Haskell), and because Rosetta Code was missing a Bash solution. (Update from 9/17/2017: they accepted my solution)

But first off, what is an Identity Matrix?

A square matrix in which all the main diagonal elements are 1’s and all the remaining elements are 0’s is called an Identity Matrix. Source

And here is an example (Source):

Alright, let’s get to generating them, or you can jump to my final solution

I usually reach for `bc`

if I need to do anything somewhat complex arithmetically in Bash. So let’s try that first.

With this script, I am going to try to do things “right” and not code-golf yet. It will also default to 10 if no argument is passed.

You can try it online or here it is:

```
#!/bin/bash
max="${*:-10}";
for i in $(seq "$max"); do
echo "obase=2; $(( 1 << max - i))"\
| bc \
| xargs printf "%0"$max"d\\n";
done
```

This isn’t super complicated, but I will still break it down:

```
max="${*:-10}";
```

Pulls in all arguments, and defaults to 10 if none are provided.

```
for i in $(seq "$max"); do
```

Just generate the `seq`

uence of 1 to the target number.

```
echo "obase=2; $(( 1 << max - i))" | bc
```

This is where it gets fun. The `obase=2`

tells `bc`

that we want base 2 instead of base 10 since we are doing bitwise changes next.

Then we do a left bitwise shift of the first argument minus the current position which gives us the numbers we need:

```
1000000000
100000000
10000000
1000000
100000
10000
1000
100
10
1
```

But we are missing the padding to make it a true Identity Matrix.

```
xargs printf "%0"$max"d\\n";
```

Pipes the output of `bc`

to `printf`

who uses its `%Wd`

syntax to pad the output to `W`

characters long, in this case the first argument, and then an explicitly escaped newline.

Which finally gives us what we want:

```
1000000000
0100000000
0010000000
0001000000
0000100000
0000010000
0000001000
0000000100
0000000010
0000000001
```

This is kind of wordy though and has a lot of drawbacks. For example:

```
$ ./bcIdentityMatrix 20
### Hundreds of Lines ###
printf: ‘10000000000000000000’: Numerical result out of range
```

Ouch. Looks like we are running into our system’s `LONG_MAX`

of 9223372036854775807 which only gives us a maximum of a 19 digit diagonal matrix. Not very versatile at all.

I probably won’t be able to escape that limit while still actually “calculating” the matrix. But I can get rid of our reliance on `bc`

, so let’s try a new method without it.

I am getting bored, so let’s code golf this a little, and let’s switch to using the powers of 10 to find the next number instead of bitwise shifting since that gave me headaches:

```
for i in `seq $(($1-1))|tac`; do printf "%0$1d\n" "$((10**i))";done
```

Not as clear as the last one, but actually simpler in a lot of ways:

```
for i in `seq $(($1-1))|tac`; do
```

Generate the `seq`

uence of 1 to the first argument minus 1. Then reverse it with it `tac`

.

A couple notes:

- I used the old method of backticks instead of the new
`$()`

because it saves me a character. Please don’t do that. - It would be tempting to try to use
`rev`

, but`tac`

is actually much better at interpreting the input from`seq`

as a list. The more “standard” solution to this problem is probably`seq $1 -1 0`

to count down. - I wasted a ton of characters with the
`-1`

part because otherwise the matrix will be off by one at the top like so:`10000000000 1000000000 0100000000 0010000000 0001000000 0000100000 0000010000 0000001000 0000000100 0000000010`

It would probably be faster to have `sed`

delete the first line if I had already used it somewhere else. Moving on:

```
printf "%0$1d\n" "$((10**i))"
```

Same general idea as the `printf`

of my first script. Pad the output of 10, to the power of the current integer (`$((10**i))`

), to the width of the first argument (`%0$1d\n`

).

Alas, this suffers the same fate of my last script:

```
$ ./exponentSolution 20
-8446744073709551616
01000000000000000000
00100000000000000000
00010000000000000000
00001000000000000000
00000100000000000000
00000010000000000000
00000001000000000000
00000000100000000000
00000000010000000000
00000000001000000000
00000000000100000000
00000000000010000000
00000000000001000000
00000000000000100000
00000000000000010000
00000000000000001000
00000000000000000100
00000000000000000010
```

At least we got rid of `bc`

though. Let’s try faking the calculation.

We will be stuck on a 19 digit matrix unless we can trick Bash into treating the output as a string, and “calculating” it using a preset number to fill in the 1’s as needed:

```
for i in `seq $1`;do printf '%*s\n' $1|tr ' ' '0'|sed "s/0/1/$i";done
```

So let’s break this down:

```
for i in `seq $1`;do
```

Pretty simple. Generate the `seq`

uence of numbers 1, to the value of our first argument.

```
printf '%*s\n' $1|tr ' ' '0'
```

Tell `printf`

to generate a string that is as long as my first argument. Then `tr`

replaces it with a `0`

. (Some might know this from another blog post). This is what the output looks like right now:

```
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
0000000000
```

Missing some vital `1`

characters. Let’s fix that:

```
sed "s/0/1/$i"
```

Switches a `0`

with a `1`

in the position that the current loop `$i`

is in. And then it generates our matrix as expected:

```
$ ./dirtysed 10
1000000000
0100000000
0010000000
0001000000
0000100000
0000010000
0000001000
0000000100
0000000010
0000000001
```

As always, /u/galaktos proposed a much cleaner and effective version of my code. I should probably just start sending them my blog before I post it :)

Here is their reproduced solution:

I wouldn’t even think of doing all that bit shifting or multiplication… here’s my take:

`for i in {1..20}; do for j in {1..20}; do printf $((i==j)); done; printf '\n'; done`

Or, in more lines:

`for i in {1..20}; do for j in {1..20}; do printf $((i==j)) done printf '\n' done`

If you don’t like the implicit boolean-to-int conversion, you can also use

`$((i==j?1:0))`

. Or`$((i==j ? 1 : 0))`

if you’re not golfing.Unfortunately, brace expansion happens before parameter and variable expansion, so just substituting 20 for

`$n`

doesn’t work if you want to make the size variable. But we can use Bash’s alternative for syntax for that:`n=${1:-10} for ((i=0;i<n;i++)); do for ((j=0;j<n;j++)); do printf $((i==j)) done printf '\n' done`

Oh, and shellcheck would prefer

`printf '%s' $((i==j))`

instead of`printf $((i==j))`

. Fair enough. (Interestingly, it doesn’t complain about the unquoted`$(())`

– it seems to know that arithmetic expansion can never produce multiple words. But unless I’m missing something, it can’t produce a`%`

sign that would be interpreted in a format string either, can it?)

The above was already a huge improvement on my loops, but they bested themselves again with a faster version:

Oh, I missed the part where we’re timing this. Here’s a slightly faster version:

`#!/bin/bash n=${1:-10} for ((i=0;i<n;i++)); do line= for ((j=0;j<n;j++)); do line+=$((i==j)) done printf '%s\n' "$line" done`

But it’s still fairly slow. But hey, it’s pure bash ¯\_(ツ)_/¯

I was also beaten on the `bc`

front with /u/ray_gun’s inventive `bc`

and `dc`

methods:

Another way of getting repeated 0s is multiplying a power of 10 by a number and removing that number with

`sed`

later (here, by 2):`time BC_LINE_LENGTH=0 bc <<< 'r=10000; for ( a = 0; a < r; a++ ) { print 2 * 10^a, 1, 2 * 10^(r - a - 1), "\n" }' | sed 's/2//g' real 0m6.236s user 0m6.415s sys 0m0.476s`

6 seconds instead of the author’s 1 minute and 28 seconds.

Here’s a faster one (ab)using

`bc'`

s`scale`

and using`sed`

to remove the leading dot:`time BC_LINE_LENGTH=10050 bc <<< 'scale=10000; 10^-1; for ( a = 0; a < scale-1; a++ ) last/10' | sed 's/^\.//' real 0m1.855s user 0m1.903s sys 0m0.382s`

`bc`

and`dc`

can also check if the number they just calculated became 0 because internally it won’t be more precise than`scale`

, e.g. with`dc`

:`time DC_LINE_LENGTH=10050 dc <<< '10000k [10/ d0=q p ldx]Sd [q]Sq 1 ldx' | sed 's/^\.//' real 0m1.719s user 0m1.716s sys 0m0.404s`

Edit:

`bc`

equivalent of the above, it needs`sed`

to remove the last line (0):`time BC_LINE_LENGTH=0 bc <<< 'scale=10000; 10^-1; while ( last != 0 ) last/10' | sed '$d;s/^\.//' >/dev/null real 0m1.504s user 0m1.523s sys 0m0.066s`

Using a very unscientific method, let’s benchmark all the solutions to 10,000 digits on my machine. The methods that can’t go above 20 are obviously disqualified, so let’s test the others:

```
$ time for i in `seq 10000`;do printf '%*s\n' 10000|tr ' ' '0'|sed "s/0/1/$i";done
real 1m31.974s
user 0m29.209s
sys 0m4.848s
```

Ouch. Not looking great so far.

```
$ time ./purebash 10000
real 15m50.447s
user 7m20.544s
sys 0m0.421s
```

Slower, but it ** is** pure

`bash`

so it gets extra points for that. I may have also needed to test it in a different way to be fair.```
$ time BC_LINE_LENGTH=0 bc <<< 'scale=10000; 10^-1; while ( last != 0 ) last/10' | sed '$d;s/^\.//'
real 0m17.071s
user 0m2.009s
sys 0m0.340s
```

Much faster than any other method! So use this one if you need to quickly generate huge matrices.

This was an interesting look at just how many methods there are for generating matrices. I especially enjoyed all the community feedback (thanks /r/bash and #bash), and I learned a ton from it.