Yet more Perl for coaching

I'm about to start scheduling the games for next season. One thing I want to do is to make sure that if a coach has multiple kids, I don't schedule them all at the same time. So, I need to find out which people have multiple kids. The database will tell me that.

The first pass is to find out the information I need:

#! /usr/bin/perl
do 'getthead.pl';

open(LNG_FILE, $ARGV[0]) || die "Can't open LNG_FILE: $!\\n";

# Determine the Column Names
do main'read_txtfile_format(\*LNG_FILE, \*languages); #' Hack to get color correct in vim...

lang: while (<LNG_FILE>) {
        next lang if (/\^#/ || /\^!/);
        eval "($languages) = split(/[,\\n]/)";

        if ($age < 20) {
                print "$fname $lname $position |$father $flname| |$mother $glname|\\n";
        } else {
                print "COACH: $fname $lname $position |$father $flname| |$mother $glname|\\n";
        }
}

close LANG_FILE;

exit(0);

Quickly, I'm guessing all coaches will be over 20. I know that won't work for all of them - but it is good enough for now. Here is some sample output:

[usc@adept ~]$ ./scheds.pl p.txt
COACH: "TOMPER" "HAYNES" "" | ""| |"" ""|
"MONSTER" "HAYNES" "" | "HAYNES"| |"MOMA" "The HAYNES"|

Okay, I want those " gone:

sub stripQuotes {
        my($raw) = @_;

        ($raw) = $raw =~ /"(.\*)"/;
        $raw;
}

        $fname = stripQuotes($fname);
        $lname = stripQuotes($lname);
        $father = stripQuotes($lname);

        if ($age < 20) {
                print "$fname $lname $position |$father $flname| |$mother $glname|\\n";
        } else {
                print "COACH: $fname $lname $position |$father $flname| |$mother $glname|\\n";
        }

Which gives:

[usc@adept ~]$ ./scheds.pl p.txt
COACH: TOMPER HAYNES "" | ""| |"" ""|
MONSTER HAYNES "" | "HAYNES"| |"MOMA" "The HAYNES"|

A function is great, but invoking it every time is going to get old. It would be best if this was done automatically during the eval. Hmm, lets look at some raw data:

..."F06|S07|   |","X",F,/  /,/  /,F,F,05/17/2005,49218,/  /,0...

For any given field, it might have a " or it may not. I don't want to hardcode this into the eval. I want to be able to use this on any standard CSV file out there. I could go look for a library, but where is the fun in that I ask you?

Lets look at the key lines:

do 'getthead.pl';

do main'read_txtfile_format(\*LNG_FILE, \*languages); #' Hack to get color correct in vim...

lang: while (<LNG_FILE>) {
        next lang if (/\^#/ || /\^!/);
        eval "($languages) = split(/[,\\n]/)";
}

What type of data structure is $languages? It has to contain the names of the fields. Let's see what happens if we try this in the loop:

        print "$languages\\n";
        exit(0);

We get:

[usc@adept ~]$ ./scheds.pl p.txt
$fname, $lname, $mname, $street, ... , $fvolunteer, $gvolunteer

We can either work with ($languages) and another eval after the statement OR make the eval more complex. Let's try mucking with the eval statement first:

sub stripArray {
        my(@ar) = @_;

        for ($i = 0; $i < @ar; $i++) {
                ($ar[i]) = $ar[i] =~ /"(.\*)"/;
        }

        @ar;
}

        eval "($languages) = stripArray(split(/[,\\n]/))";

Which yields:

[usc@adept ~]$ ./scheds.pl p.txt
COACH:  "HAYNES" "" |"" ""| |"" ""|
 "HAYNES" "" |"TOMPER" "HAYNES"| |"MOMA" "The HAYNES"|

For grins, lets take a quick diversion and see what happens with a second eval:

        eval "($languages) = split(/[,\\n]/)";
        eval "@ar = ($languages); for ($i = 0; $i < @ar; $i++) { ($ar[i]) = $ar[i] =~ /\\"(.\*)\\"/;}";

And it too fails:

[usc@adept ~]$ ./s.pl p.txt
COACH: "TOMPER" "HAYNES" "" |"" ""| |"" ""|
"MONSTER" "HAYNES" "" |"TOMPER" "HAYNES"| |"MOMA" "The HAYNES"|

Ack, I have $ar[i] instead of $ar[$i]. Fixing it does not matter. Double frak! I did the same thing with the first example! Let's try it again:

sub stripArray {
        my(@ar) = @_;

        for ($i = 0; $i < @ar; $i++) {
                ($ar[$i]) = $ar[$i] =~ /"(.\*)"/;
        }

        @ar;
}

        eval "($languages) = stripArray(split(/[,\\n]/))";

And it yields:

TOMPER HAYNES  | | | |
MONSTER HAYNES  |TOMPER HAYNES| |MOMA The HAYNES|

It is harder to detect NULLS, but it will be much easier to work with the data. But, I've got another bug. I decided to check and see what would happen if I tried to print a field which did not have quotes to start with:

        if ($age < 20) {
                print "$fname $lname $age $position |$father $flname| |$mother $glname|\\n";
        } else {
                print "COACH: $fname $age $lname $position |$father $flname| |$mother $glname|\\n";
        }

Which gives:

[usc@adept ~]$ ./scheds.pl p.txt
TOMPER HAYNES   | | | |
MONSTER HAYNES   |TOMPER HAYNES| |MOMA The HAYNES|

I had this working last night and the fix is pretty simple:

sub stripArray {
        my(@ar) = @_;

        for ($i = 0; $i < @ar; $i++) {
                if ($ar[$i] =~ /\^"(.\*)"$/) {
                        ($ar[$i]) = $ar[$i] =~ /"(.\*)"/;
                }
        }

        @ar;
}

I.e., only change it if it has a start and end quote:

[usc@adept ~]$ ./scheds.pl p.txt
COACH: TOMPER 29 HAYNES  | | | |
MONSTER HAYNES 9  |TOMPER HAYNES| |MOMA The HAYNES|

No, I'm not 29, but I don't want to give away too much info. ;)

Now I want to detect conflicts, which means storing lines and then going back over them. I noticed that coaches do not have fathers and mothers. That can be used to pull them out. Here is the first pass at detecting conflicts:

my(@arSibs);

lang: while (<LNG_FILE>) {
        next lang if (/\^#/ || /\^!/);
        eval "($languages) = stripArray(split(/[,\\n]/))";

        if (($father ne "" && $flname ne "") || ($mother ne "" && $glname ne "")) {
                if ($sex eq "M") {
                        $i = "B";
                } else {
                        $i = "G";
                }

                my(%curr) = (
                        'name' => $fname,
                        'last' => $lname,
                        'agegroup' => "U" . $age . $i,
                        'ffirst' => $father,
                        'flast' => $flname,
                        'mfirst' => $mother,
                        'mlast' => $glname);

                push(@arSibs, \\%curr);
        }
}

close LANG_FILE;

sub sortByLast {
        $$a{last} cmp $$b{last};
}

my(@ar) = sort(sortByLast @arSibs);

for ($i = 0; $i < @ar; $i++) {
        $match = 0;

        for ($j = $i + 1; $j < @ar; $j++) {
                if ((($ar[$i]{ffirst} eq $ar[$j]{ffirst}) &&
                                ($ar[$i]{flast} eq $ar[$j]{flast})) ||
                                (($ar[$i]{mfirst} eq $ar[$j]{mfirst}) &&
                                ($ar[$i]{mlast} eq $ar[$j]{mlast}))) {
                        if ($match == 0) {
                                $match = 1;
                                print $ar[$i]{name} . " " . $ar[$i]{last} . " " . $ar[$i]{agegroup} .
                                        " has conflicts:\\n";
                        }

                        print $ar[$j]{name} . " " . $ar[$j]{last} . " " . $ar[$j]{agegroup} .  "\\n";
                }
        }
}

We create an array of structures and do a single pass to detect if there is another entry with a common parent. Note that it is okay for the fathers to be the same for one child, but the mothers to be different - stuff happens. A copy of the Monster and a quick test yields:

[usc@adept ~]$ ./scheds.pl p.txt
MONSTER HAYNES U9B has conflicts:
MONSTER THE U9B

Look what happens as we add a sister:

[usc@adept ~]$ ./scheds.pl p.txt
MONSTER HAYNES U9B has conflicts:
MONSTER LITTLE U9G
MONSTER THE U9B
MONSTER LITTLE U9G has conflicts:
MONSTER THE U9B

We already know that THE and LITTLE have a conflict. What we need to do is record when a parent has been scanned. Note that since we only have the parent's names, we cannot tell if TOMPER HAYNES is unique. If we adjust the script as:

                my(%curr) = (
                        'name' => $fname,
                        'last' => $lname,
                        'agegroup' => "U" . $age . $i,
                        'ffirst' => $father,
                        'flast' => $flname,
                        'fmatch' => 0,
                        'mfirst' => $mother,
                        'mlast' => $glname,
                        'mmatch' => 0);


for ($i = 0; $i < @ar; $i++) {
        $match = 0;

        for ($j = $i + 1; $j < @ar; $j++) {
                $fmatch = ($ar[$i]{ffirst} eq $ar[$j]{ffirst}) &&
                                ($ar[$i]{flast} eq $ar[$j]{flast}) &&
                                ($ar[$j]{fmatch} == 0);
                if ($fmatch) {
                        $ar[$j]{fmatch} = 1;
                }

                $mmatch = ($ar[$i]{mfirst} eq $ar[$j]{mfirst}) &&
                                ($ar[$i]{mlast} eq $ar[$j]{mlast}) &&
                                ($ar[$j]{mmatch} == 0);

                if ($mmatch) {
                        $ar[$j]{mmatch} = 1;
                }

                if ($fmatch || $mmatch) {
                        if ($match == 0) {
                                $match = 1;
                                print $ar[$i]{name} . " " . $ar[$i]{last} . " " . $ar[$i]{agegroup} .
                                        " has conflicts:\\n";
                        }

                        print $ar[$j]{name} . " " . $ar[$j]{last} . " " . $ar[$j]{agegroup} .  "\\n";
                }
        }
}

As we find a matching parent, we mark them off. Since we have to check the match twice (once to report and once to record), we use a local variable. Note that the corresponding variable is only set if the parent has not been recorded.

When we try it on the small dataset, we get:

[usc@adept ~]$ ./scheds.pl p.txt
MONSTER HAYNES U9B has conflicts:
MONSTER LITTLE U9G
MONSTER THE U9B

One of the things I'm trying to push here is that you have to unit test. What happens if my worst nightmare was to come true and I had a child whose mother was not my wife. And to make the resulting battle fair, she had one whose father was not me?

It actually depends on how they are entered. If my Monster is first:

[usc@adept ~]$ ./scheds.pl p.txt
MONSTER HAYNES U9B has conflicts:
MONSTER THE U9B
MONSTER LITTLE U9G
MONSTER CRINGE U9G
MONSTER CLONE U9G

But if one of the non-aligned ones is first, we could get:

[usc@adept ~]$ ./scheds.pl p.txt
MONSTER CLONE U9G has conflicts:
MONSTER HAYNES U9B
MONSTER LITTLE U9G
MONSTER THE U9B
MONSTER CRINGE U9G has conflicts:
MONSTER HAYNES U9B
MONSTER LITTLE U9G
MONSTER THE U9B

The point is that unless we invest a lot of time into getting all of the permutations correct, we are going to gloss over some information. We ought to print out which parent is causing the conflict:

                        print "\\t" . $ar[$j]{name} . " " . $ar[$j]{last} . " " . $ar[$j]{agegroup};

                        if ($mmatch) {
                                print "(" . $ar[$i]{mfirst} . " " . $ar[$i]{mlast} . ")";
                        }

                        if ($fmatch) {
                                print "(" . $ar[$i]{ffirst} . " " . $ar[$i]{flast} . ")";
                        }

                        print "\\n";

Which would show us:

[usc@adept ~]$ ./scheds.pl p.txt
MONSTER CLONE U9G has conflicts:
        MONSTER HAYNES U9B(MOMA The HAYNES)
        MONSTER LITTLE U9G(MOMA The HAYNES)
        MONSTER THE U9B(MOMA The HAYNES)
MONSTER CRINGE U9G has conflicts:
        MONSTER HAYNES U9B(TOMPER HAYNES)
        MONSTER LITTLE U9G(TOMPER HAYNES)
        MONSTER THE U9B(TOMPER HAYNES)

Or

[usc@adept ~]$ ./scheds.pl p.txt
MONSTER HAYNES U9B has conflicts:
        MONSTER THE U9B(MOMA The HAYNES)(TOMPER HAYNES)
        MONSTER LITTLE U9G(MOMA The HAYNES)(TOMPER HAYNES)
        MONSTER CRINGE U9G(TOMPER HAYNES)
        MONSTER CLONE U9G(MOMA The HAYNES)

Note how a simple boolean gets swamped (we would expect both parents in the first example). It might be better to remove the boolean and try to do some final coorelation by eye:

[usc@adept ~]$ ./scheds.pl p.txt
MONSTER CLONE U9G has conflicts:
        MONSTER HAYNES U9B(MOMA The HAYNES)
        MONSTER LITTLE U9G(MOMA The HAYNES)
        MONSTER THE U9B(MOMA The HAYNES)
MONSTER CRINGE U9G has conflicts:
        MONSTER HAYNES U9B(TOMPER HAYNES)
        MONSTER LITTLE U9G(TOMPER HAYNES)
        MONSTER THE U9B(TOMPER HAYNES)
MONSTER HAYNES U9B has conflicts:
        MONSTER LITTLE U9G(MOMA The HAYNES)(TOMPER HAYNES)
        MONSTER THE U9B(MOMA The HAYNES)(TOMPER HAYNES)
MONSTER LITTLE U9G has conflicts:
        MONSTER THE U9B(MOMA The HAYNES)(TOMPER HAYNES)

But it might be even better to change the loop construct to:

for ($i = 0; $i < @ar; $i++) {
        $match = 0;

        for ($j = 0; $j < @ar; $j++) {
                if ($i == $j) {
                        next;
                }

To get a full report for each player:

[usc@adept ~]$ ./scheds_all.pl p.txt
MONSTER CLONE U9G has conflicts:
        MONSTER HAYNES U9B(MOMA The HAYNES)
        MONSTER LITTLE U9G(MOMA The HAYNES)
        MONSTER THE U9B(MOMA The HAYNES)
MONSTER CRINGE U9G has conflicts:
        MONSTER HAYNES U9B(TOMPER HAYNES)
        MONSTER LITTLE U9G(TOMPER HAYNES)
        MONSTER THE U9B(TOMPER HAYNES)
MONSTER HAYNES U9B has conflicts:
        MONSTER CLONE U9G(MOMA The HAYNES)
        MONSTER CRINGE U9G(TOMPER HAYNES)
        MONSTER LITTLE U9G(MOMA The HAYNES)(TOMPER HAYNES)
        MONSTER THE U9B(MOMA The HAYNES)(TOMPER HAYNES)
MONSTER LITTLE U9G has conflicts:
        MONSTER CLONE U9G(MOMA The HAYNES)
        MONSTER CRINGE U9G(TOMPER HAYNES)
        MONSTER HAYNES U9B(MOMA The HAYNES)(TOMPER HAYNES)
        MONSTER THE U9B(MOMA The HAYNES)(TOMPER HAYNES)
MONSTER THE U9B has conflicts:
        MONSTER CLONE U9G(MOMA The HAYNES)
        MONSTER CRINGE U9G(TOMPER HAYNES)
        MONSTER HAYNES U9B(MOMA The HAYNES)(TOMPER HAYNES)
        MONSTER LITTLE U9G(MOMA The HAYNES)(TOMPER HAYNES)

Before I give you the final script - there is one more bug I found. When we enter something into the array, we check to see if at least one parent has a valid entry. What happens if a player has only one parent recorded?

CLOPPER HALLS U11B has conflicts:
        JAK SPRAT U13B( )
        SAM ASPEN U8G( )
        EDMUND LOWER U8B( )

Not what we want. We could modify the match check to be:

                $fmatch = ($ar[$i]{ffirst} eq $ar[$j]{ffirst}) &&
                                ($ar[$i]{flast} eq $ar[$j]{flast}) &&
                                ($ar[$i]{ffirst} ne "") &&
                                ($ar[$i]{flast} ne "");

                $mmatch = ($ar[$i]{mfirst} eq $ar[$j]{mfirst}) &&
                                ($ar[$i]{mlast} eq $ar[$j]{mlast}) &&
                                ($ar[$i]{mfirst} ne "") &&
                                ($ar[$i]{mlast} ne "");;

To wrap it up, I'm going to give the complete script which gives all of the poential conflicts. I'm actually not done with it, but the database is not finished. I really don't care about individual conflicts - I want to know about team conflicts. For example, if two siblings are on the same team and there are no other siblings, then no conflict exists. But, I'll get to that once the data has all been entered!

The getthead.pl script was presented in Some more Perl for coaching.

#! /usr/bin/perl

sub stripArray {
        my(@ar) = @_;

        for ($i = 0; $i < @ar; $i++) {
                if ($ar[$i] =~ /\^"(.\*)"$/) {
                        ($ar[$i]) = $ar[$i] =~ /"(.\*)"/;
                }
        }

        @ar;
}

do 'getthead.pl';

open(LNG_FILE, $ARGV[0]) || die "Can't open LNG_FILE: $!\\n";

# Determine the Column Names
do main'read_txtfile_format(\*LNG_FILE, \*languages); #' Hack to get color correct in vim...

my(@arSibs);

lang: while (<LNG_FILE>) {
        next lang if (/\^#/ || /\^!/);
        eval "($languages) = stripArray(split(/[,\\n]/))";

        if (($father ne "" && $flname ne "") || ($mother ne "" && $glname ne "")) {
                if ($sex eq "M") {
                        $i = "B";
                } else {
                        $i = "G";
                }

                my(%curr) = (
                        'name' => $fname,
                        'last' => $lname,
                        'agegroup' => "U" . $age . $i,
                        'ffirst' => $father,
                        'flast' => $flname,
                        'fmatch' => 0,
                        'mfirst' => $mother,
                        'mlast' => $glname,
                        'mmatch' => 0);

                push(@arSibs, \\%curr);
        }
}

close LANG_FILE;

sub sortByLast {
        $$a{last} cmp $$b{last};
}

my(@ar) = sort(sortByLast @arSibs);

for ($i = 0; $i < @ar; $i++) {
        $match = 0;

        for ($j = 0; $j < @ar; $j++) {
                if ($i == $j) {
                        next;
                }

                $fmatch = ($ar[$i]{ffirst} eq $ar[$j]{ffirst}) &&
                                ($ar[$i]{flast} eq $ar[$j]{flast}) &&
                                ($ar[$i]{ffirst} ne "") &&
                                ($ar[$i]{flast} ne "");

                $mmatch = ($ar[$i]{mfirst} eq $ar[$j]{mfirst}) &&
                                ($ar[$i]{mlast} eq $ar[$j]{mlast}) &&
                                ($ar[$i]{mfirst} ne "") &&
                                ($ar[$i]{mlast} ne "");;

                if ($fmatch || $mmatch) {
                        if ($match == 0) {
                                $match = 1;
                                print $ar[$i]{name} . " " . $ar[$i]{last} . " " . $ar[$i]{agegroup} .
                                        " has conflicts:\\n";
                        }

                        print "\\t" . $ar[$j]{name} . " " . $ar[$j]{last} . " " . $ar[$j]{agegroup};

                        if ($mmatch) {
                                print "(" . $ar[$i]{mfirst} . " " . $ar[$i]{mlast} . ")";
                        }

                        if ($fmatch) {
                                print "(" . $ar[$i]{ffirst} . " " . $ar[$i]{flast} . ")";
                        }

                        print "\\n";
                }
        }
}

exit(0);

Technorati Tags:
Orginally posted on Kool Aid Served Daily
Copyright (C) 2006, Kool Aid Served Daily
Comments:

Post a Comment:
  • HTML Syntax: NOT allowed
About

tdh

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today