Justifying Text

Justifying Text#

In this notebook we will take a look at how to write some array-oriented APL code that justifies some text. In case you are wondering what “justified text” is, take a look at the example in the Wikipedia.

(The content of this notebook draws heavily from Lesson 42 of the APL Cultivation chat lessons and is a written follow-up to Rodrigo Girão Serrão’s presentation at APL Seeds’21.)

The Text We’ll Justify#

While we prototype our solution that justifies some text, let us define some text to play around with:

⎕← text ← 'To APL, or not to APL, that is' 'the question: whether ''tis' 'nobler in the mind to suffer' 'the slings and arrows of the' 'outrageous fortune...'

┌──────────────────────────────┬──────────────────────────┬────────────────────────────┬────────────────────────────┬─────────────────────┐ │To APL, or not to APL, that is│the question: whether 'tis│nobler in the mind to suffer│the slings and arrows of the│outrageous fortune...│ └──────────────────────────────┴──────────────────────────┴────────────────────────────┴────────────────────────────┴─────────────────────┘

Now we have a vector with 5 elements, where each box represents one of the lines of the text we want to justify:

≢text

5

Data Format#

However, instead of dealing with a vector of “strings” – which in APL we call character vectors – it would be more useful to work with the text data in a different format:

↑text

To APL, or not to APL, that is the question: whether 'tis nobler in the mind to suffer the slings and arrows of the outrageous fortune...

If we work with the text in this format, it looks more like printed text on a page and it also becomes easier to develop our solution. With that in mind, let us redefine our text variable to hold the text like this:

text ← ↑text

We can look at ↑ (which is called mix) as a function that is stacking up the vectors that compose text.

The resulting data we have is now a matrix: a two-dimensional structure composed of characters. The way you check text is a matrix is by asking APL to tell you the shape of text:

⍴text

5 30

⍴ is called shape and it tells you that text has 5 rows and 30 columns…

But that is certainly interesting, because some of the lines are clearly shorter than 30 characters… When we used ↑ to convert our text to a matrix, APL padded the shorter lines with spaces so that all lines had the same length.

The easiest way to verify that we actually have trailing whitespace on the shorter lines is by using @ to replace them with something else:

'⎕'@(' '=⊢)text

To⎕APL,⎕or⎕not⎕to⎕APL,⎕that⎕is the⎕question:⎕whether⎕'tis⎕⎕⎕⎕ nobler⎕in⎕the⎕mind⎕to⎕suffer⎕⎕ the⎕slings⎕and⎕arrows⎕of⎕the⎕⎕ outrageous⎕fortune...⎕⎕⎕⎕⎕⎕⎕⎕⎕

You can read the line above as

“Put the character '⎕' at the positions where the character ' ' is equal to the characters in the text matrix.”

Working with a matrix of characters is already half of the work done, because it does simplify our code greatly.

The Matrix We Want to Build#

Of course this particular data format is only useful because we had an idea for how to solve this problem of justifying some text. Let us take another look at the matrix:

text

To APL, or not to APL, that is the question: whether 'tis nobler in the mind to suffer the slings and arrows of the outrageous fortune...

In order to justify this text, what we want to do is take the trailing spaces and spread them along the inner spaces of each line. In essence, we want to go from

⍝ To⎕APL,⎕or⎕not⎕to⎕APL,⎕that⎕is
⍝ the⎕question:⎕whether⎕'tis⎕⎕⎕⎕
⍝ nobler⎕in⎕the⎕mind⎕to⎕suffer⎕⎕
⍝ the⎕slings⎕and⎕arrows⎕of⎕the⎕⎕
⍝ outrageous⎕fortune...⎕⎕⎕⎕⎕⎕⎕⎕⎕

to

⍝ To⎕APL,⎕or⎕not⎕to⎕APL,⎕that⎕is
⍝ the⎕⎕⎕question:⎕⎕whether⎕⎕'tis
⍝ nobler⎕⎕in⎕⎕the⎕mind⎕to⎕suffer
⍝ the⎕⎕slings⎕⎕and⎕arrows⎕of⎕the
⍝ outrageous⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕fortune...

If we take a closer look at the second row of these two matrices, we notice that there were four trailing spaces that got divided as evenly as possible among the three inner spaces that existed: the first space (between “the” and “question”) is now 3 spaces wide and the two other spaces are 2 spaces wide:

⍝ the⎕question:⎕whether⎕'tis⎕⎕⎕⎕
⍝ the⎕⎕⎕question:⎕⎕whether⎕⎕'tis

In order to achieve this, what we will do is a build an integer matrix (with a shape that matches the shape of text) and each integer in that matrix will tell how many copies of each character we want in the final result.

For example, for the second row of text, we would like this integer row:

chars ← 'the question: whether ''tis    '
ints  ← 1 1 1 3 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 1 1 1 0 0 0 0
↑chars ints

t h e q u e s t i o n : w h e t h e r ' t i s 1 1 1 3 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 1 1 1 0 0 0 0

Each integer below each character tells how many copies of that character we want:

below letters and punctuation there is always a 1, because we want to preserve that character as-is;
below the trailing spaces we have a 0, because we want to remove that character from the line; and
below the inner spaces there are the numbers 3 and 2, because we want to create copies of those characters.

Without further ado, let’s start building this matrix. We will do it in three simpler steps.

The Characters to Keep#

Finding Spaces#

The first step is computing a Boolean matrix that will help us distinguish the trailing spaces from all other characters. Let us start by identifying the spaces in the text matrix:

' '=text

0 0 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1

Trailing Spaces#

What we want to do now is take the Boolean matrix above and extract another Boolean matrix that contains 1 for all characters in the text that we care about and that contains 0 in the positions of the trailing spaces.

To achieve this effect, we first use spaces to create a matrix that identifies the trailing spaces with 1s, and then we negate that matrix:

⌽∧\⌽' '=text

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1

There are three distinct symbols in the line above, so what do they do?

⌽ is the reverse function and the way it works is really simple:

⌽1 2 3 4

4 3 2 1

⌽text

si taht ,LPA ot ton ro ,LPA oT sit' rehtehw :noitseuq eht reffus ot dnim eht ni relbon eht fo sworra dna sgnils eht ...enutrof suoegartuo

The part that may require a deeper explanation is the ∧\ part. Start by noticing that ∧ is the Boolean and function:

0∧0

0

0∧1

0

1∧0

0

1∧1

1

(In APL we use 0 for “false” and 1 for “true”.)

Reduce and Scan#

Then \ is an operator: an APL glyph that takes a function and modifies it in some way. To understand what \ does, let us start by looking at the following:

2×3×4×5

120

×/2 3 4 5

120

10+45+23+1

79

+/10 45 23 1

79

Can you spot the pattern?

/ is the reduce operator: it takes a function on its left and puts it in between the elements of the vector on the right.

\ is the scan operator and works in a similar manner, but it gives you the intermediate results as well:

(2) (2×3) (2×3×4) (2×3×4×5)

2 6 24 120

×\2 3 4 5

2 6 24 120

Reducing and Scanning with `∧`#

f/ can often be read as a single unit:

+/ is “sum”
×/ is “product”
,/ is “join”
…

In particular, ∧/ is read as “all”, as it returns 1 only if all elements are 1:

∧/1 1 1

1

∧/1 1 0 1

0

Tied with ∧/ is ∧\ which returns the partial results:

∧\1 1 0 1

1 1 0 0

Because of the interpretation of ∧/ and the way \ scans the input vector, ∧\ takes a Boolean vector and returns a Boolean vector with the leading 1s preserved and everything else as 0s:

∧\1 1 1 1

1 1 1 1

∧\1 1 0 0 1 1

1 1 0 0 0 0

∧\0 1 1 1 1 1

0 0 0 0 0 0

You can read ∧\ as “were all 1 so far?”.

Reducing and Scanning a Matrix#

When you use f/ or f\ with a matrix, it is as if you did f/ or f\ on each row separately, so

+/' '=text

7 7 7 7 10

tells you how many spaces each line has, for example.

Now you should be able to better grasp what we did above: ⌽spaces puts the trailing spaces in the beginning and then ∧\ makes it so that only the leading 1s remain:

∧\⌽' '=text

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

When we reverse once more we put the 1s back in their place, therefore creating a Boolean matrix that tells us exactly where the trailing spaces are:

⌽∧\⌽' '=text

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1

We can negate this matrix, therefore creating a matrix that tells us what characters we want to keep and which characters (the trailing spaces) to discard:

⎕← keep ← ~⌽∧\⌽' '=text

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0

The function ~ above is the negate function and it turns 0s into 1s and 1s into 0s.

Inner Space Expansion#

Now that we have two Boolean matrices, one that tells the positions that contain spaces, and another that tells the positions that contain characters to keep, we can combine them to create a matrix with the positions of the inner spaces:

⎕← inner ← keep∧' '=text

0 0 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

We will now investigate, for each of the positions marked in this inner matrix, how many spaces should be added in those positions.

In order to figure that out, we first need to know how many trailing spaces each row has to distribute:

⎕← trail ← +/~keep

0 4 2 2 9

Just remember that +/ sums each row and ~keep has 1s where the trailing spaces are.

Similarly, we need to know how many positions exist that could receive trailing spaces, but this is just a matter of summing the inner matrix along the rows:

⎕← spaces ← +/inner

7 3 5 5 1

Let us put the trail and spaces vectors on top of each other:

↑trail spaces

0 4 2 2 9 7 3 5 5 1

Here is what this means:

the 1st row has 0 trailing spaces that need to be distributed among 7 positions;
the 2nd row has 4 trailing spaces that need to be distributed among 3 positions;
…
the 5th row has 9 trailing spaces that need to be distributed among 1 position.

If we divide trail by spaces we can figure out how many spaces to add to each position in each row:

trail÷spaces

0 1.33333 0.4 0.4 9

However, we cannot add 1.33333 spaces in a position, nor 0.4 spaces. We can only add whole characters, so let us round those values down:

⌊trail÷spaces

0 1 0 0 9

This shows how many spaces each position in each row can get, so let us encode that information in the inner matrix that we have:

inner

0 0 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

The inner matrix just says which positions can receive spaces, but we want a matrix that says how many spaces to add in each position.

One of the simplest ways to do that is by multiplying each row by the corresponding value in ⌊trail÷spaces. It would be great if we could write

inner × ⌊trail÷spaces

RANK ERROR: Mismatched left and right argument ranks
      inner×⌊trail÷spaces
           ∧

But unfortunately we cannot, so we need to tell APL that the × times function should operate on each row of inner and each element of the vector ⌊trail÷spaces, and we can do that with the rank operator ⍤:

⎕← add ← inner (×⍤1 0) ⌊trail÷spaces

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

First, note that in each row, the 1s have been replaced by the corresponding number in ⌊trail÷spaces.

The rank operator ⍤ is a really powerful operator but it might take some time to wrap your head around it. I can recommend Richard Park’s webinars on the rank operator.

Extra Spaces#

You might have noticed that the add matrix, that tells you how many spaces can be added to each position, does not account for all the trailing spaces that had to be distributed:

+/add

0 3 0 0 9

trail

0 4 2 2 9

That is because some of the rows were such that the number of inner spaces did not divide into the total number of trailing spaces that had to be distributed. We still need to distribute these spaces.

First of all, how many do we still have to distribute? You can compute it as

trail - +/add

0 1 2 2 0

But it can also be

spaces|trail

0 1 2 2 0

if you make use of the residue function | .

Enumerating the Inner Spaces#

Now that we know how many spaces are missing in each row, we need to figure out which of the inner positions receive those extra spaces. From the result of spaces|trail we see that

1st and 5th rows need no extra spaces;
2nd row needs 1 extra space; and
3rd and 4th rows need 2 extra spaces;

We wish to create a matrix, similar to inner, but that tells us what positions get these extra spaces. In order to get there, we start by scanning inner with +:

+\inner

0 0 1 1 1 1 1 2 2 2 3 3 3 3 4 4 4 5 5 5 5 5 6 6 6 6 6 7 7 7 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 0 0 0 0 0 0 1 1 1 2 2 2 2 3 3 3 3 3 4 4 4 5 5 5 5 5 5 5 5 5 0 0 0 1 1 1 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 4 4 4 5 5 5 5 5 5 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Notice that this creates an interesting integer matrix, where the value in each row increases by 1 whenever we hit a 1 in the original inner matrix:

inner

0 0 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

If we multiply the two together, we get an enumeration of the marked inner positions:

inner × +\inner

0 0 1 0 0 0 0 2 0 0 3 0 0 0 4 0 0 5 0 0 0 0 6 0 0 0 0 7 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 2 0 0 0 3 0 0 0 0 4 0 0 5 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 2 0 0 0 3 0 0 0 0 0 0 4 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Now we can see which of these numbered positions need an extra space. We can do so by comparing the rows of that matrix with the values in spaces|trail. Again, we cannot do this directly:

(inner×+\inner) ≤ spaces|trail

RANK ERROR: Mismatched left and right argument ranks
      (inner×+\inner)≤spaces|trail
                     ∧

We need to use the rank operator ⍤ once more to pair the rows of inner×+\inner with the values of spaces|trail:

(inner×+\inner) (≤⍤1 0) spaces|trail

1 1 0 1 1 1 1 0 1 1 0 1 1 1 0 1 1 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

This result looks funny because our comparison also looked at all of the 0s in the inner matrix, although we don’t care about those. In order to fix it, we need to multiply by inner again:

inner × (inner×+\inner) (≤⍤1 0) spaces|trail

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Now this shows the positions that will receive one extra space. Let us save it in a variable:

extra ← inner × (inner×+\inner) (≤⍤1 0) spaces|trail

Notice that we could multiply by inner just once, in the end, instead of twice:

inner × (+\inner) (≤⍤1 0) spaces|trail

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Putting Everything Together#

We have our three matrices keep, add, and extra, so we can build the matrix we set out to build:

keep+add+extra

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 1 1 1 0 0 0 0 1 1 1 1 1 1 2 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 10 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0

We just have to use it to replicate the original characters by the correct amount:

text

To APL, or not to APL, that is the question: whether 'tis nobler in the mind to suffer the slings and arrows of the outrageous fortune...

Luckily, APL has a replicate function, which is also in the symbol /. For example:

1 0 2 0 3 0 / 'DYALOG'

DAAOOO

Unfortunately, / does not work as-is with two matrices:

(keep+add+extra)/text

RANK ERROR
      (keep+add+extra)/text
                      ∧

We need to flatten the two matrices with ,, and then use ⍴ to reshape the result back to its original shape:

(⍴text)⍴(,keep+add+extra)/,text

To APL, or not to APL, that is the question: whether 'tis nobler in the mind to suffer the slings and arrows of the outrageous fortune...

Writing a Dfn#

If we take all our code and put it in a (single-line) dfn, here is what we get:

Just ← { keep ← ~⌽∧\⌽' '=⍵ ⋄ inner ← keep∧' '=⍵ ⋄ trail ← +/~keep ⋄ spaces ← +/inner ⋄ add ← inner(×⍤1 0)⌊trail÷spaces ⋄ extra ← inner×(+\inner)(≤⍤1 0)spaces|trail ⋄ (⍴⍵)⍴(,keep+add+extra)/,⍵ }

Just text

To APL, or not to APL, that is the question: whether 'tis nobler in the mind to suffer the slings and arrows of the outrageous fortune...

(Notice that, for one, this implementation may not be the best solution to this task, but was written in this way because of the context in which it was presented. Secondly, this dfn should probably be defined across multiple lines, but current limitations of the TryAPL site mean we can only define one-line dfns.)

A More Robust Function#

While our code works quite well, it has a very serious shortcoming: it does not work on text with empty lines…

text ← ↑'This is the first line.' '' 'This is another paragraph.'
text

This is the first line. This is another paragraph.

Just text

DOMAIN ERROR: Divide by zero
Just[0] Just←{keep←~⌽∧\⌽' '=⍵ ⋄ inner←keep∧' '=⍵ ⋄ trail←+/~keep ⋄ spaces←+/inner ⋄ add←inner(×⍤1 0)⌊trail÷spaces ⋄ extra←inner×(+\inner)(≤⍤1 0)spaces|trail ⋄ (⍴⍵)⍴(,keep+add+extra)/,⍵}
                                                                                                          ∧

Gladly, we can fix this without too much trouble. We can just add some pre-processing to ignore blank lines or lines that do not need justifying (lines that have only 1 word or lines that are too short) and reuse our Just function.

(This was shown in the APL Seeds presentation very briefly, with no explanation, as it was used to justify a text with approximately 12.500 lines (a text file of Jules Verne’s “Twenty Thousand Leagues Under the Sea”, downloaded from Project Gutenberg.)

BetterJust ← { s ← ' '=⍵ ⋄ t ← ⌽∧\⌽s ⋄ fewWs ← 0=+/s-t ⋄ shortL ← (+/t)>0.25×⊃⌽⍴⍵ ⋄ use ← ~fewWs∨shortL ⋄ result ← ⍵ ⋄ (use⌿result) ← Just use⌿⍵ ⋄ result }

BetterJust text

This is the first line. This is another paragraph.

We can also take our original text, add a header and a blank line for demonstration purposes, and justify it again:

⎕← text ← ↑'HAMLET' '' 'To APL, or not to APL, that is' 'the question: whether ''tis' 'nobler in the mind to suffer' 'the slings and arrows of the' 'outrageous fortune...'

HAMLET To APL, or not to APL, that is the question: whether 'tis nobler in the mind to suffer the slings and arrows of the outrageous fortune...

BetterJust text

HAMLET To APL, or not to APL, that is the question: whether 'tis nobler in the mind to suffer the slings and arrows of the outrageous fortune...

Notice that, now, the last line doesn’t get justified because it is “too short”. What “too short” means is controlled by a parameter in the BetterJust function, which is set to 0.25 right now (meaning lines that have 25% or more of trailing whitespace are not justified).

Justifying Text

Contents

Justifying Text#

The Text We’ll Justify#

Data Format#

The Matrix We Want to Build#

The Characters to Keep#

Finding Spaces#

Trailing Spaces#

Reduce and Scan#

Reducing and Scanning with ∧#

Reducing and Scanning a Matrix#

Inner Space Expansion#

Extra Spaces#

Enumerating the Inner Spaces#

Putting Everything Together#

Writing a Dfn#

A More Robust Function#

Reducing and Scanning with `∧`#