The Perfect Trap to Catch a Newline

Hah, I need to write this down before I forget it again... One of the files I was editing was created on Windows: All the HTML was displayed in one long line with lots of ^M's instead of newlines -- pretty annoying when you just want to do some quick fixes with a simple Linux text editor.

The easiest way in Linux to replace one character (or set of characters) with another is the tr command. If you get an error from tr, note that it expects its input on STDIN, so don't give it a file name as an argument. And make sure the output file has a different name than the input file, otherwise the file will go all Ouroboros on you and eat itself. Usually, you give tr two arguments: the character to replace and the replacement.
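Here is a small sketch of the Ouroboros effect and one way around it. The file names (page.html, page.tmp) and the temporary directory are made up for the demonstration; the point is only that the shell empties the output file before tr gets to read anything.

```shell
# Work in a throwaway directory so nothing real gets eaten.
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/page.html"

# Don't do this: the shell truncates page.html before tr reads a single byte.
tr 'l' 'L' < "$workdir/page.html" > "$workdir/page.html"
empty_size=$(wc -c < "$workdir/page.html" | tr -d ' ')
echo "size after self-overwrite: $empty_size"

# Safer: write to a second file, then move it over the original.
printf 'hello\n' > "$workdir/page.html"
tr 'l' 'L' < "$workdir/page.html" > "$workdir/page.tmp" \
    && mv "$workdir/page.tmp" "$workdir/page.html"
cat "$workdir/page.html"
```

The mv only runs if tr succeeded, so a failed conversion leaves the original file alone.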

But to search for a special character like ^M, you can't just search for a ^ and an M; you need heavier ordnance. The ^ thingy is shaped like an upside-down v, and that was what finally reminded me how to do it. You press ctrl-v and then hit return. It will come out as ^M. Don't ask me why, it's magic. It also works with tab and other stuff. So the following line will replace all alien ^M characters with good and friendly \n characters:

tr '^M' '\n' < file.html > file2.html # press ctrl-v return for ^M

The Wikipedia entry on newlines claims I could have searched for \r instead of ^M, or used dos2unix if installed -- but oh well, next time. My solution at least escapes any character I want, not just "return", and it's always installed.
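For next time, the \r variant could look roughly like this. It is only a sketch: the sample file is generated on the spot with printf, and the names sample.html and fixed.html are invented for the example.

```shell
# Build a small sample file that uses bare carriage returns as line endings.
workdir=$(mktemp -d)
printf 'line one\rline two\r' > "$workdir/sample.html"

# \r is tr's escape for the carriage return, the same byte that shows up as ^M.
tr '\r' '\n' < "$workdir/sample.html" > "$workdir/fixed.html"

# od -c prints every byte, so you can check that only \n is left.
od -c "$workdir/fixed.html"
```

This saves the ctrl-v dance, but it only helps for characters that tr has an escape for.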

What about other options? In emacs, I'd press esc shift-5 for the search & replace "dialog", and when I replace some crazy special character, I usually just cowardly select it with the mouse while no one's looking, and paste it into the dialog by pressing the middle mouse button. Yes, generally this also works for newlines -- you select from after the last character of a line to before the first character of the next line, and it will select the (invisible) newline character. (Copying & pasting newline characters works in MacOS too; only in Windows have I never succeeded with this trick.) Unfortunately, emacs somehow couldn't generate a "normal" newline character as replacement in this context. And replacing ^M with \n in vi only gave me even crazier ^@ thingies. Hm, seems I'll stick with the shell then.

Unless anyone has a better solution. Feel free to leave a comment.

Comments:

Actually, the end-of-line on Windows is a carriage return and a linefeed (aka newline, aka \n), so replacing the carriage returns (aka ^M, aka \r) with newlines will double up the line endings.

My old standby for Windows-to-Unix conversions is:

$ perl -pe 's/\r//g' < infile > outfile

Posted by Clayton Wheeler on August 09, 2006 at 01:11 PM CEST #

I know the case you mean, but all the text was on one line. There was no extra newline next to the ^M; it was only one character. I don't know which of the two cases is more common, so it seems we have to be aware of both possibilities when converting.
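A rough way to tell the two cases apart before converting is to count the carriage returns and linefeeds in the file. The sketch below makes up two sample files and a helper called count_byte (not a standard command), and its guess for the mixed case is only a heuristic.

```shell
workdir=$(mktemp -d)
printf 'a\r\nb\r\n' > "$workdir/crlf.txt"   # carriage return plus linefeed
printf 'a\rb\r'     > "$workdir/cr.txt"     # bare carriage returns only

# Hypothetical helper: delete everything except the given character,
# then count the bytes that are left.
count_byte() {
    tr -cd "$1" < "$2" | wc -c | tr -d ' '
}

for f in "$workdir/crlf.txt" "$workdir/cr.txt"; do
    cr=$(count_byte '\r' "$f")
    lf=$(count_byte '\n' "$f")
    if [ "$cr" -gt 0 ] && [ "$lf" -eq 0 ]; then
        printf '%s: bare CR endings, translate CR to LF\n' "$f"
    elif [ "$cr" -gt 0 ]; then
        printf '%s: CR and LF both present, probably CRLF, so delete the CRs\n' "$f"
    else
        printf '%s: no CR found, nothing to convert\n' "$f"
    fi
done
```

In the first case the tr translation fits; in the second, deleting the carriage returns avoids doubled line endings.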

Posted by Seapegasus on August 14, 2006 at 05:08 AM CEST #
