Shell Tips - how to cut a substring?

Although awk has string functions such as substr, I wrote an sh function that cuts a substring you specify off the end of a string.

Usage: strcut $String $SubCut

Example:

strcut "Goodshell" "shell"

It will print "Good"

Code:

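#!/bin/sh

# strcut STRING SUFFIX
# Print STRING with SUFFIX removed from its end.
# Return 1 if the arguments are wrong or nothing is left to print.
strcut ()
{
if [ $# -ne 2 ]; then
    #echo "Argument error!" 1>&2
    return 1
fi

strSrc=$1
strSub=$2

# Compare the tail of the string with substr() rather than a regex, so that
# characters such as "." or "*" in the suffix are matched literally.
# Note: awk still processes backslash escapes in the Sub=... assignment.
newStr=`printf '%s\n' "$strSrc" | awk '{ len = length($0) - length(Sub); if (len >= 0 && substr($0, len + 1) == Sub) print substr($0, 1, len) }' Sub="$strSub"`

# An empty result means no match (or that the suffix was the whole string).
if [ -z "$newStr" ]; then
    #echo "Cannot cut" 1>&2
    return 1
else
    printf '%s\n' "$newStr"
    return 0
fi
}

strcut "$@"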

 

Comments:

If using ksh:
#! /bin/ksh

strcut() {
        echo "${1%$2}"
}

strcut "Use Korn shell instead" " instead"

Posted by Mike Gerdts on July 03, 2007 at 01:17 PM CST #
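
The same parameter expansion is also part of POSIX sh, so it is not limited to ksh. A minimal sketch, assuming a POSIX-conforming /bin/sh (the old SunOS Bourne shell does not qualify); the name strcut_posix is made up for this example. Quoting "$2" inside the expansion makes the shell match the suffix literally instead of as a glob pattern. Unlike the awk version, it prints the string unchanged when the suffix does not match.

```shell
# strcut_posix STRING SUFFIX  (hypothetical name, not from the original post)
# Print STRING without a trailing literal SUFFIX, using only built-ins.
strcut_posix() {
    # The inner quotes around $2 disable glob matching for the suffix.
    printf '%s\n' "${1%"$2"}"
}
```

For example, strcut_posix "Goodshell" "shell" prints "Good".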

Good.

Posted by se on July 03, 2007 at 01:50 PM CST #

Good! But does anyone know a better way using sh? sed and cut also seem to be available.

Posted by xue on July 03, 2007 at 02:25 PM CST #

Yes, there is a way in Bourne Shell (sh) using built-ins only, but why bother when better shells are around? That way would be to use a while loop with a case statement where each case matches against <character>* and accumulates that character, until you find that the accumulated string concatenated with the strSub argument equals strSrc or you run out of characters.

Posted by Nico on July 03, 2007 at 06:57 PM CST #

sub () {
    orig=$1
    suffix=$2

    case "$orig" in
        *${suffix}) :;;
        *) echo "Error: $2 not a suffix of $1" 1>&2; return 1;;
    esac

    prefix=
    finished=false

    while :
    do
        case "$orig" in
            ${prefix}${suffix}) finished=:; break;;
            ${prefix}a*${suffix}) prefix=${prefix}a;;
            # ...
            ${prefix}z*${suffix}) prefix=${prefix}z;;
            *) break;;
        esac
    done

    if $finished
    then
        echo "PREFIX=$prefix"
        return 0
    fi

    echo "Error: something went wrong -- probably not enough arms in the case statement" 1>&2
    return 1
}

Posted by Nico on July 03, 2007 at 08:48 PM CST #

Actually, my version uses one non-built-in: /bin/false. I think that's OK :), but you could always fix it so it doesn't.

The need for a case/esac arm for every character you could match makes this function very long, but it does show what you can do with Bourne Shell glob matching and the case statement (er, actually, it's a "command" according to the man page, though built-in)!

BTW, my knowledge of what can be done with bare-bones Bourne Shell comes from the old SunOS 4.x days when separate / and /usr partitions were de rigueur -- add ODS and often there'd be problems where the OS would stop booting and drop into single-user mode with / mounted read-only and /usr not mounted at all. Knowing the ins and outs of Bourne Shell programming in such situations was very useful! Someone (or was it a magazine article? I forget) showed me how to cat files with sh but without cat and I never looked at shell scripting the same way again.

Posted by Nico on July 03, 2007 at 09:14 PM CST #
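
The "cat without cat" trick was probably something along the lines of the loop below. This is a guess at the technique, not necessarily what Nico was shown. It is a sketch for a POSIX sh, where read -r is available and printf is usually a built-in; on the old Bourne shell you would have to fall back on plain read and echo. The name shcat is made up for this example.

```shell
# shcat FILE...  (hypothetical name)
# Print each file to stdout using only the shell's own read/printf.
shcat() {
    for f in "$@"; do
        # IFS= keeps leading and trailing blanks; -r keeps backslashes literal.
        # The extra test picks up a last line that has no trailing newline
        # (it is then printed with a newline added).
        while IFS= read -r line || [ -n "$line" ]; do
            printf '%s\n' "$line"
        done < "$f"
    done
}
```

This only works for text files: the shell cannot hold NUL bytes in a variable.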

One more thing: there were no sed, cut, awk, cat, ls, nor many other commands in SunOS when /usr was not mounted.

Posted by Nico on July 03, 2007 at 09:19 PM CST #

Nico, thanks very much for the comment! It's helpful for me. :)

Posted by xue on July 04, 2007 at 03:36 AM CST #

AFAIK both solutions (shell built-in commands and awk/sed filters) are correct, but keep the following rule in mind:
- Only use external filters if you have _lots_ of data to pass through them (e.g. more than one page; see /usr/bin/pagesize), otherwise the |fork()|+|exec()| overhead costs too much CPU time.

See also the draft of the shell style guide, section "Only use external filters like grep/sed/awk/etc. if you want to process lots of data with them".

Posted by Roland Mainz on July 22, 2007 at 04:56 AM CST #
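
To illustrate the rule above, here is a rough sketch that strips a ".txt" suffix from a list of short strings in three ways. The function names are made up for this example and a POSIX sh is assumed. All three print the same result; what differs is how many processes are started.

```shell
# 1. Built-in parameter expansion: no new process per string.
strip_builtin() {
    for name in "$@"; do
        printf '%s\n' "${name%.txt}"
    done
}

# 2. One sed process per string: a fork()+exec() for every item.
strip_sed_each() {
    for name in "$@"; do
        printf '%s\n' "$name" | sed 's/\.txt$//'
    done
}

# 3. One sed process for the whole list: the start-up cost is paid once.
strip_sed_once() {
    printf '%s\n' "$@" | sed 's/\.txt$//'
}
```

Running each one under time with a few thousand arguments should make the difference visible. The second form is the one the rule warns against; the third is fine once there is enough data to justify the filter.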
