Some more Perl for coaching

We needed to send an email to every player under the age of 13. Not every player has an email address, and some have several. I exported all of the players from the League Organizer database into a comma-separated (CSV) file, which also put the column names on the first line.

Here is a sample of the file:

fname,lname,mname,street,street2,town,state,zip,...,cust5data,cust6data,fvolunteer,gvolunteer
"Alpo","Food","The","199 East","","Tulsa","OK","74133",...,0,0,"",""

The first thing I did was add a '!' to the start of the first line. I did that because I have an old Perl script which dynamically matches the variable names supplied on a header line (marked with a leading '!') with the values on each line that follows. The helper function is:

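package read_txtfile_format;

# Skip lines until one starts with '!', then turn the rest of that line
# ("fname,lname,...") into a variable list ('$fname, $lname, ...').
sub main'read_txtfile_format {
        local(*file,*format) = @_;
        local($first_line, $first_char) = '';

        do {
                $first_line = <file>;
                $first_line =~ /(.)(.*)/;

                $first_char = $1;
                $first_line = $2;
        } until ($first_char eq "!" || eof(file));

        # Test the character rather than eof, so a header on the last line still counts.
        if ($first_char ne "!") {
                die "There is no ! header line in the input file";
        }

        $format = '$' . join(', $', split(/,/, $first_line));
}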
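
To make the last line of the helper concrete, here is a minimal sketch of what that expression produces. The header below is a hypothetical, shortened version of the real one (the actual export has many more columns), with the leading '!' already stripped:

```perl
use strict;
use warnings;

# Hypothetical header line, shortened from the real export,
# with the leading '!' already removed by the helper.
my $header = 'fname,lname,age,email';

# Same expression the helper uses to build its format string.
my $format = '$' . join(', $', split(/,/, $header));

print "$format\n";   # prints: $fname, $lname, $age, $email
```

That string is what later gets pasted into the left-hand side of an assignment.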

The first pass was to make sure I could read the file and get the fields I wanted:

#! /usr/bin/perl

$, = ' ';        # set output field separator
$\ = "\n";       # set output record separator

$FS = "\t";

do 'getthead.pl';

open(LNG_FILE, $ARGV[0]) || die "Can't open LNG_FILE: $!\n";

# Determine the Column Names
do main'read_txtfile_format(*LNG_FILE, *languages); #' Hack to get color correct in vim...

lang: while (<LNG_FILE>) {
        next lang if (/^#/ || /^!/);
        eval "($languages) = split( /[,\n]/ )";

        print "$fname $lname $age |$email|$memail|$femail|";
}

close LNG_FILE;

exit(0);

First we read the input file to get the column names made into variable names. Then each time we process a line, we dynamically assign the different fields to the named variables. I can then use $fname to refer to the first name in that line.
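
For comparison, the same name-to-value mapping can be done without eval by using a hash slice. This is not what the script above does; it is only a sketch of an alternative, and the column names and data line are hypothetical stand-ins for the real export:

```perl
use strict;
use warnings;

# Hypothetical column names and data line, standing in for the real export.
my @columns = split /,/, 'fname,lname,age,email';
my $line    = '"FRED","KIBBLES",11,"fred@example.com"' . "\n";

chomp $line;
my %row;
@row{@columns} = split /,/, $line;   # hash slice: one field per column name

# The surrounding quotes from the file are still part of each value.
print "$row{fname} $row{lname} $row{age} |$row{email}|\n";
```

The trade-off is that fields are reached as $row{fname} rather than $fname, but nothing from the input file is ever evaluated as code.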

I surround each of the three email fields with a '|' so that I can spot empty strings:

"FRED" "KIBBLES" 11 |"fred.kibbles@..."||""|
"EDITH" "BITS" 14 |""||"edeath.bits"|

The first problem: why was there nothing in the second field? It turns out I had used $memail instead of $gemail. I fixed that and got:

"FRED" "KIBBLES" 11 |"fred.kibbles@..."|""|""|
"EDITH" "BITS" 14 |""|""|"edeath.bits"|

The next step was to eliminate players who were too old:

        if ($age < 13) {
                print "$fname $lname $age |$email|$gemail|$femail|";
        }

And that yields:

"FRED" "KIBBLES" 11 |"fred.kibbles@..."|""|""|

A player might have no email addresses at all, so I wanted to handle that case next. I also wanted to print only the email addresses, and to sort them. Adding an associative array keyed on the address solves that:

        if ($age < 13) {
                if ($email ne "") {
                        $emTo{$email} = $email;
                }
                if ($gemail ne "") {
                        $emTo{$gemail} = $gemail;
                }
                if ($femail ne "") {
                        $emTo{$femail} = $femail;
                }
        }
}

foreach $key (sort(keys(%emTo))) {
        print "$key";
}

And I still got 3 entries per line. This threw me for a long while, until I realized that the '"' characters I saw in the output were actually part of the input from the file. The comparison was testing for a truly empty email address and not one of the form "". I had to modify the test to:

        if ($age < 13) {
                if ($email ne "\"\"") {
                        $emTo{$email} = $email;
                }
        }
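
A small sketch of why the first test never fired. The data line here is hypothetical, modelled on the sample output above:

```perl
use strict;
use warnings;

# Hypothetical line with an empty, quoted email field at the end.
my @fields = split /,/, '"EDITH","BITS",14,""';
my $email  = $fields[3];

my $naive_says_present = ($email ne "")   ? 1 : 0;   # compares against a truly empty string
my $fixed_says_present = ($email ne '""') ? 1 : 0;   # compares against the two quote characters

print "length of field: ", length($email), "\n";
print "naive test: $naive_says_present, fixed test: $fixed_says_present\n";
```

The "empty" field is really two characters long, so only the second comparison treats it as missing.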

Of course, the output was still of the format:

"fred.kibbles@..."

But this was dead simple to eliminate; I just had to do some regexping:

        if ($age < 13) {
                if ($email ne "\"\"") {
                        $emTo{$email} =~ $email = /"(.*)"/;
                }
        }

I got a parsing error. The =~ was in the wrong place. This should work:

        if ($age < 13) {
                if ($email ne "\"\"") {
                        $emTo{$email} = $email =~ /"(.*)"/;
                }
        }

Nope, still had output of the form:

"fred.kibbles@..."

The problem here is that it didn't matter what I was stuffing into the array; the key was always the original string:

$emTo{"fred.kibbles@..."} <- fred.kibbles@...

I was partly right, but the value being stored turned out to be 1, not the contents. More on that in a moment. The fix for the key problem, which also made the code generic, was:

                        $key = $email =~ /"(.*)"/;
                        if ($key =~ /@/) {
                                $emTo{$key} = $key;
                        }

This drops invalid addresses (anything without an '@'), and I only have to change one line to make it work for any email field. This should run sweetly. But it didn't. When I added a debug print line:

                        print "$key";

All I got was:

1
1
1
1

I struggled with this one for a while. I had similar code in another script:

        my($base) = /(.*)\.txt/;

This stripped out the base filename from a list passed into the standard input. I'm normally very good about declaring all of my variables before use, but with the dynamic nature of this script, I wasn't doing that. The one thing which stuck out in my mind was that '1' was probably an array subscript. So I finally tried this before I threw up my hands:

                        ($key) = $email =~ /"(.*)"/;
                        if ($key =~ /@/) {
                                $emTo{$key} = $key;
                        }

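I.e., I made a list of one element and assigned the results there. This worked! The parentheses put the match in list context, where it returns the captured text; in scalar context it only returns 1 for a successful match, which is where all those 1s came from.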
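
The difference is easy to see side by side. A minimal sketch, using a made-up address in the same quoted form the file uses:

```perl
use strict;
use warnings;

my $email = '"fred@example.com"';   # hypothetical quoted field

my $scalar = $email =~ /"(.*)"/;    # scalar context: 1 on a successful match
my ($list) = $email =~ /"(.*)"/;    # list context: the captured text

print "scalar: $scalar\n";          # scalar: 1
print "list:   $list\n";            # list:   fred@example.com
```

The only change between the two assignments is the pair of parentheses on the left-hand side.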

The final version of the script was:

#! /usr/bin/perl

$, = ' ';        # set output field separator
$\ = "\n";       # set output record separator

$FS = "\t";

do 'getthead.pl';

open(LNG_FILE, $ARGV[0]) || die "Can't open LNG_FILE: $!\n";

# Determine the Column Names
do main'read_txtfile_format( *LNG_FILE, *languages ); #' Hack to get color correct in vim...

lang: while (<LNG_FILE>) {
        next lang if (/^#/ || /^!/);
        eval "($languages) = split( /[,\n]/ )";

        if ($age < 13) {
                if ($email ne "\"\"") {
                        ($key) = $email =~ /"(.*)"/;
                        if ($key =~ /@/) {
                                $emTo{$key} = $key;
                        }
                }
                if ($gemail ne "\"\"") {
                        ($key) = $gemail =~ /"(.*)"/;
                        if ($key =~ /@/) {
                                $emTo{$key} = $key;
                        }
                }
                if ($femail ne "\"\"") {
                        ($key) = $femail =~ /"(.*)"/;
                        if ($key =~ /@/) {
                                $emTo{$key} = $key;
                        }
                }
        }
}

close LNG_FILE;

foreach $key (sort(keys(%emTo))) {
        print "$key";
}

exit(0);

I could tidy it up, make a function to abstract parsing an email address. But why? This works.
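
For the curious, the tidied version might look something like the sketch below. The helper name add_email and the sample fields are made up for illustration; the script above does not contain them:

```perl
use strict;
use warnings;

my %emTo;

# Hypothetical helper: strip the surrounding quotes from a field and
# record the result if it looks like an address (contains an '@').
sub add_email {
    my ($field) = @_;
    return unless defined $field;
    my ($addr) = $field =~ /"(.*)"/;
    $emTo{$addr} = $addr if defined $addr && $addr =~ /@/;
}

# Made-up fields: a good address, an empty field, a bad one, and a duplicate.
add_email($_) for ('"fred@example.com"', '""', '"edeath.bits"', '"fred@example.com"');

print "$_\n" for sort keys %emTo;
```

Inside the loop, the three near-identical if blocks would then collapse to one add_email call per email field.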

It turns out that 364 addresses will get an email invitation to join the Tulsa Nationals Player Development Program:

[usc@adept ~]$ ./mlist.pl people.txt | wc -l
364

Now to create a mailing list and to make sure I BCC everyone - I don't want to be giving spammers ammunition.


Originally posted on Kool Aid Served Daily
Copyright (C) 2006, Kool Aid Served Daily