Help writing flexible splits in Perl


A couple of weeks ago I posted a question about trouble I was having parsing an irregularly formatted data file. Here's a sample of the data:

01-021412 15/02/2007  207,000.00 14,839.00  18       -6     2     6     6     5    16     6     4     4     3   -28   -59   -88  -119                                                      -149  -191  -215  -246                   atraso promedio --->        2.88 

I need the program to extract 01-021412 and 18, count and sum the numbers in the subsequent series, store the atraso promedio, and repeat the operation on 40,000 entries. I received a helpful response, and with it I was able to write this code:

    use strict;
    use warnings;

    # Create the output file and write the CSV header.
    open(my $out, '>', 'outfull.csv') or die "Cannot open outfull.csv: $!";
    print $out "loanid,npayments,atrasopromedio,atrasoalt,", join(',', 1 .. 72), "\n";

    open(my $myinputfile, '<', 'datos historico aspire2.txt')
        or die "Cannot open input file: $!";

    my @payments;
    my $numberofpayments;
    my $loannumber;

    while (<$myinputfile>) {
        if (/\b\d{2}-\d{6}\b/) {
            # First line of a loan: ID, three unused fields, payment count, payments.
            ($loannumber, undef, undef, undef, $numberofpayments, @payments) = split;
        }
        elsif (m/---> *(\d*\.\d*)/) {
            # Summary line: the fourth field is the atraso promedio.
            my (undef, undef, undef, $atrasopromedio) = split;
            my $n = scalar @payments;
            print "$numberofpayments,$n,$loannumber\n";

            if ($n == $numberofpayments) {
                my $total = 0;
                $total += $_ for @payments;
                my $atrasoalt = $total / $n;
                print $out "$loannumber,$numberofpayments,$atrasopromedio,$atrasoalt,",
                    join(',', @payments), "\n";
            }
        }
        else {
            # Continuation line: more payments for the current loan.
            push(@payments, split);
        }
    }

    close($out);
    close($myinputfile);

This works fine, except for the fact that 50 percent of the entries include a '*', as follows:

* 01-051948 06/03/2009  424,350.00 17,315.00  48        0     6    -2     0    21    10     9    13    10     9     7    13     3     4                                                         12    -3    14     8     6        atraso promedio --->        3.02 

The asterisk causes the program to fail because it interrupts the split pattern, causing incorrect variable assignments. Until now I've dealt with this by removing the asterisks from the input data file, but I just realized that in doing so the program omits these loans altogether. Is there an economical way to modify the script so that it handles entries both with and without asterisks?

As an aside, if an entry does include an asterisk, I would like to record that fact in the output data.

Many thanks in advance, Aaron

Use an intermediate array:

    my $has_asterisk;

    # ...

    if (/\b\d{2}-\d{6}\b/) {
        # Split into a temporary array first so a leading '*' can be dropped.
        my @fields = split;
        $has_asterisk = $fields[0] eq '*';
        shift @fields if $has_asterisk;
        ($loannumber, undef, undef, undef, $numberofpayments, @payments) = @fields;
    }
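To also record the asterisk in the output, as asked in the aside, one option is to carry the flag alongside the other fields and write it as an extra CSV column. The following is only a minimal sketch: the helper names `parse_loan_header` and `format_row`, the hash keys, and the position of the flag column are assumptions made for illustration and are not part of the original script. The CSV header line would need a matching extra column name.

```perl
use strict;
use warnings;

# Hypothetical helper: parse the first line of a loan entry,
# tolerating an optional leading '*'.
sub parse_loan_header {
    my ($line) = @_;

    # split ' ' skips leading whitespace and splits on runs of whitespace.
    my @fields = split ' ', $line;

    # Remember whether the marker was present, then drop it.
    my $has_asterisk = (@fields && $fields[0] eq '*') ? 1 : 0;
    shift @fields if $has_asterisk;

    my ($loannumber, undef, undef, undef, $numberofpayments, @payments) = @fields;
    return {
        loannumber       => $loannumber,
        numberofpayments => $numberofpayments,
        payments         => \@payments,
        has_asterisk     => $has_asterisk,
    };
}

# Hypothetical helper: build one CSV row, with the flag (1 or 0)
# placed right after the loan ID (an assumed column position).
sub format_row {
    my ($loan, $atrasopromedio) = @_;
    my @payments = @{ $loan->{payments} };

    my $total = 0;
    $total += $_ for @payments;

    # Leave the average empty rather than divide by zero.
    my $atrasoalt = @payments ? $total / @payments : '';

    return join(',', $loan->{loannumber}, $loan->{has_asterisk},
        $loan->{numberofpayments}, $atrasopromedio, $atrasoalt, @payments) . "\n";
}

# Usage with a shortened, made-up entry:
my $loan = parse_loan_header('* 01-051948 06/03/2009  424,350.00 17,315.00  48  0 6 -2');
print format_row($loan, '3.02');
```

In the real script, payments from continuation lines would still be pushed onto the same list before the row is written; the sketch only covers the first line of an entry.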
