Working with Windows and Unix

Beware new line characters

One of the most frequent issues we encounter in Tech Support is the corruption of files that are transferred between Windows and Unix. The corruption can be introduced at any stage, but it ultimately comes down to a file being transferred by an ftp client running on Windows; it could be the command-line ftp or FileZilla.

Windows uses two characters to mark the end of a line in a text file:
carriage return and linefeed (CR/LF). Unix uses a single character, linefeed (LF).
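
To make the difference concrete, here is a minimal Python sketch that counts each kind of line ending in a chunk of bytes. The function name count_line_endings and the sample text are made up for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows (CR/LF) and Unix (LF only) line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CR/LF pair also contains an LF, so subtract the pairs
    # to get the bare, Unix-style LFs.
    lf_only = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf_only}


windows_text = b"echo one\r\necho two\r\n"
unix_text = b"echo one\necho two\n"

print(count_line_endings(windows_text))    # {'crlf': 2, 'lf': 0}
print(count_line_endings(unix_text))       # {'crlf': 0, 'lf': 2}
print(len(windows_text) - len(unix_text))  # 2 (one extra byte per line)
```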

It is best to use binary mode for every transfer, including ascii text files.
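
For scripted transfers, the same advice can be followed with Python's standard ftplib module, whose storbinary and retrbinary methods put the connection in binary mode before moving any data. This is only a sketch: the helper names upload_binary and download_binary are my own, and the host, login, and file names in the usage comment are placeholders.

```python
from ftplib import FTP


def upload_binary(ftp, local_path, remote_name):
    """Send a local file in binary mode so no byte is translated."""
    with open(local_path, "rb") as fh:
        # storbinary() issues TYPE I (binary) before sending the file.
        ftp.storbinary(f"STOR {remote_name}", fh)


def download_binary(ftp, remote_name, local_path):
    """Fetch a remote file in binary mode and write it out unchanged."""
    with open(local_path, "wb") as fh:
        # retrbinary() also issues TYPE I, then hands each block to fh.write.
        ftp.retrbinary(f"RETR {remote_name}", fh.write)


# Usage (placeholder host and credentials, not run here):
#
#   with FTP("ftp.example.com") as ftp:
#       ftp.login("user", "password")
#       upload_binary(ftp, "core.gz", "core.gz")
```

In the interactive ftp client, the equivalent is to type the binary command before get or put.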

Common problems:

  • Upload a core file from Unix to Windows using ftp in ascii mode.
    The file is going to be larger on Windows than on Unix.
    ftp doesn't know whether this is a text file with real line-ends; it takes every ascii LF and writes two ascii characters, CR/LF.
    The core file, tar file, library ... will be corrupted when transferred to Oracle.
  • Download a shell script to Windows, and transfer it to Unix using ftp.
    If the file is edited on Windows, each line-end in the Unix script becomes two characters (CR/LF) instead of one.
    Unix doesn't know how to handle the extra CR, and will likely tell you the script is not executable.
    Why?  The first line of a shell script (called the "sh-bang") identifies the command interpreter the Unix shell should use for this script.   Common examples:
    #!/bin/sh
    #!/bin/ksh
    #!/bin/bash
    #!/bin/perl

    #!/bin/sh^M    # will not be understood; ^M is the stray CR, so Unix looks for an interpreter named "sh^M".
    #!/bin/env ksh # special syntax.  env finds ksh on the PATH and runs it
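
Both problems can be reproduced in a few lines of Python. This is a simplified simulation, not what an ftp client does byte for byte: ascii_mode_to_windows only models the end result of an ascii-mode transfer to Windows (every LF becomes CR/LF), and both function names and the sample bytes are invented for the example.

```python
def ascii_mode_to_windows(data: bytes) -> bytes:
    """Model the end result of an ascii-mode transfer: each LF becomes CR/LF."""
    return data.replace(b"\n", b"\r\n")


def shebang_has_cr(script: bytes) -> bool:
    """Return True if the sh-bang line ends in a stray CR (displayed as ^M)."""
    first_line = script.split(b"\n", 1)[0]
    return first_line.startswith(b"#!") and first_line.endswith(b"\r")


# Made-up binary payload: the two 0x0A bytes are data, not line ends.
core_fragment = bytes([0x7F, 0x45, 0x4C, 0x46, 0x0A, 0x00, 0x0A, 0xFF])
damaged = ascii_mode_to_windows(core_fragment)
print(len(core_fragment), len(damaged))  # 8 10

script = b"#!/bin/sh\necho hello\n"
print(shebang_has_cr(script))                         # False
print(shebang_has_cr(ascii_mode_to_windows(script)))  # True
```

The binary payload grows by one byte for every 0x0A it happens to contain, which is why the damaged copy no longer matches the original.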

dos2unix is a common utility, found on most Unix platforms, that repairs Windows line-end characters in Unix script files.   I've written my own flavor of this utility for use in Tech Support and build environments; it is a bit easier to use and has some nice side-effects.

  • accepts a list of files:   dos2unix *.sh
  • repairs the file in-place.  Doesn't generate a new file you have to name
  • retains the same timestamp;  it is the encoding that changed, not the file content.
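
The three behaviors above can be sketched in Python. This is not the utility offered in this post, only an illustration of the same idea under my own assumptions: the names dos2unix_in_place and convert_all are hypothetical, and the whole file is read into memory, which is fine for scripts but not for very large files.

```python
import os


def dos2unix_in_place(path):
    """Convert CR/LF to LF in place, keeping the file's timestamps.

    Returns True if the file was changed, False if it was already clean.
    """
    info = os.stat(path)  # remember the timestamps before touching the file
    with open(path, "rb") as fh:
        original = fh.read()
    converted = original.replace(b"\r\n", b"\n")
    if converted == original:
        return False
    with open(path, "wb") as fh:
        fh.write(converted)
    # Put back the access and modification times recorded above.
    os.utime(path, ns=(info.st_atime_ns, info.st_mtime_ns))
    return True


def convert_all(paths):
    """Repair every file in the list and return the ones that were changed."""
    return [p for p in paths if dos2unix_in_place(p)]
```

Calling convert_all(sys.argv[1:]) from a small wrapper script would give the "list of files" behavior, since the Unix shell expands *.sh before the script starts.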

Here are the versions of dos2unix for each of the environments we work in.
They are compressed with gzip, to avoid the ftp ascii transfer trap,
and because I am limited by the number and size of files for this blog.

About

Dick Dunbar is an escalation engineer working in the Customer Engineering & Advocacy Lab (CEAL team)
for Oracle Analytics and Performance Management.
I live and work in Santa Cruz, California.
I'll share the techniques I use to detect, avoid and repair problems.
