What is a hard return?
A "hard return" is defined (by me at least) as a line-ending character (or character pair) that is used to mark the end of every line of text, not only the end of a paragraph. To find out more about special characters in text files, read here.
Why they are bad
This problem is frequently seen in email (especially in replies which quote the original text). The problem with hard line endings is that assumptions are made about the font size and other visible characteristics of the reader. Instead of letting the reader decide when to wrap a line of text to the next line, hard line ends force the decision.
For example, here is a copy of the previous paragraph, wrapped at 64 characters, displayed in a relatively wide window (every line shown ends with a hard return):
This problem is frequently seen in email (especially in replies
which quote the original text). The problem with hard line
endings is that assumptions are made about the font size and
other visible characteristics of the reader. Instead of letting
the reader decide when to wrap a line of text to the next line,
hard line ends force the decision.
As you can see, space is being wasted on the right-hand side, because hard returns (after the words "replies", "line", "and", etc.) are forcing the text to wrap to the next line.
Here is the same paragraph, this time displayed in a narrow window:
This problem is frequently seen in email (especially in replies
which quote the original text). The problem with hard line
endings is that assumptions are made about the font size and
other visible characteristics of the reader. Instead of letting
the reader decide when to wrap a line of text to the next line,
hard line ends force the decision.
You can see that the first line is forced to wrap because of the confines of the window, but then the text wraps again, because of the hard return after "replies", even though it doesn't need to. This situation is even worse than the previous example, since it confuses the reader by making the text look like a bunch of ugly short paragraphs.
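The narrow-window effect can be reproduced with a few lines of Python. This is only a rough sketch: `render_in_window` is a made-up helper name, the window width of 55 characters is an arbitrary choice, and a real window measures width in pixels rather than characters. Each hard-wrapped line is wrapped on its own, which is roughly what a display does when it has to respect every hard return.

```python
import textwrap

# The paragraph from above, with its hard returns at column 64.
HARD_WRAPPED = "\n".join([
    "This problem is frequently seen in email (especially in replies",
    "which quote the original text). The problem with hard line",
    "endings is that assumptions are made about the font size and",
    "other visible characteristics of the reader. Instead of letting",
    "the reader decide when to wrap a line of text to the next line,",
    "hard line ends force the decision.",
])


def render_in_window(text, window_width):
    """Simulate a window that keeps every hard return and also soft-wraps
    any line that is too long for the window (hypothetical helper)."""
    rows = []
    for hard_line in text.split("\n"):
        # textwrap.wrap returns [] for an empty line, so keep a blank row.
        rows.extend(textwrap.wrap(hard_line, width=window_width) or [""])
    return rows


if __name__ == "__main__":
    for row in render_in_window(HARD_WRAPPED, 55):
        print(row)
```

With a window narrower than 64 characters, most hard-wrapped lines split into one full row followed by a short leftover row, which is the "ugly short paragraphs" look described above.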
Why they are used
We have these problems mainly because of the evolution from old technology to new. In the old days before powerful computers (before they even had monitors), computers would respond to user input with printed output (spitting out text on a physical printer). Now these printers were pretty dumb, in that they didn't do much thinking, so one of the things you had to do was to tell them when to start a new line. It was just like a typist having to decide when to start a new line: a manual typewriter could not 'know' how long the word you were typing would be, so it was up to the typist to make sure there was enough room left on the current line. These days it is trivial for a computer to decide "on the fly" how to display or print text. There is ABSOLUTELY NO NEED for a person to add hard returns to the text, because they can't know how an end user might be displaying it!
Getting rid of them
So, what about automatic processing to remove unwanted hard returns? Jujusoft Reader attempts to strip hard returns (it guesses which line breaks are there to end a paragraph, and which are there only to limit the line width). The problem is, this is not always obvious, as in the following example:
This problem is frequently seen in email. And
now some poetry,
This is the first line,
and this is the second,
So how does a computer tell
How do we know which lines to unwrap then?
That was my illustrative poem.
Here the only hard return which should be removed is the first one (after "And"). All the others should stay. How does a piece of software know this though, without the human benefits of literacy and intuition to guide it? It doesn't! That's the whole problem! There will always be this kind of ambiguity when people use what is essentially a paragraph mark to limit the number of characters on a line.
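To make the guessing concrete, here is a minimal sketch of one common heuristic. It is not how Jujusoft Reader works (the text above doesn't say), and the function name `unwrap` and the `width` parameter are my own assumptions. The idea: a line break was probably added only to limit the line width if the first word of the next line would not have fit on the current line.

```python
POEM = "\n".join([
    "This problem is frequently seen in email. And",
    "now some poetry,",
    "This is the first line,",
    "and this is the second,",
    "So how does a computer tell",
    "How do we know which lines to unwrap then?",
    "That was my illustrative poem.",
])


def unwrap(text, width):
    """Join lines whose break looks width-induced (hypothetical heuristic).

    `width` is a guess at the column the author wrapped at; the function
    has no way of knowing the real value.
    """
    lines = text.split("\n")
    result = [lines[0]]
    for prev, line in zip(lines, lines[1:]):
        words = line.split()
        # Would the next line's first word have overflowed the previous line?
        overflowed = (
            bool(prev.strip())
            and bool(words)
            and len(prev.rstrip()) + 1 + len(words[0]) > width
        )
        if overflowed:
            # Treat the break as a soft wrap and glue the lines together.
            result[-1] = result[-1].rstrip() + " " + line.lstrip()
        else:
            # Treat the break as intentional and keep it.
            result.append(line)
    return "\n".join(result)


if __name__ == "__main__":
    print(unwrap(POEM, 47))
```

With a guessed width of 47 this happens to remove only the break after "And". Guess 46 instead and it also glues the last two lines together; guess 64 and it removes nothing at all. The result depends entirely on a number the software cannot know, which is the ambiguity described above.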
Saturday, October 19, 2002