Imagine you have a file with multiple columns and you need to quickly get the total of one column from the UNIX command line. The easiest way is to use AWK.

Let’s take the following file:

1 2 3 4 5 6 7 8 9
2 3 4 5 6 7 8 9 10
3 4 5 6 7 8 9 11 12

To get the sum of column 7 (7 + 8 + 9), execute the following command:

$ awk '{sum+=$7} END {print sum}' datafile.txt
24
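
To try this without preparing a file by hand, here is a minimal sketch that writes the sample data to a temporary directory and wraps the one-liner in a small shell function. The function name sum_col and the variable workdir are made up for this example; only the awk program itself comes from the command above.

```shell
# Write the sample file into a scratch directory (workdir is an assumed name).
workdir=$(mktemp -d)
cat > "$workdir/datafile.txt" <<'EOF'
1 2 3 4 5 6 7 8 9
2 3 4 5 6 7 8 9 10
3 4 5 6 7 8 9 11 12
EOF

# sum_col is a hypothetical helper: sum_col COLUMN [FILE...]
# With no FILE it reads standard input.
# "sum + 0" forces a number, so empty input prints 0 instead of a blank line.
sum_col() {
    col=$1
    shift
    awk -v col="$col" '{ sum += $col } END { print sum + 0 }' "$@"
}

sum_col 7 "$workdir/datafile.txt"   # prints 24
```

Passing the column number with -v keeps the awk program itself unchanged when you want a different column.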

Let’s do a little more than simply getting the sum. We now write a small script that calculates the sum of a column and also prints a running total.

Input

2010 191 291 391 491 591 691 791 891 991
2011 144 286 391 491 591 691 791 891 991
2012 112 254 354 454 554 654 754 854 954
2013 191 291 391 491 591 691 791 891 991
2014 191 291 391 491 591 691 791 891 991
2015 178 291 391 491 591 691 791 891 991

Program

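#!/bin/awk -f
# scriptname: calc_sum
# input: column number (colnum=N), data file
BEGIN {
    # Header row; the leading tab leaves room for the label column.
    printf ("\tValue\tRunning total\n")
}
{
    # Add this record's value to the running total, then print both.
    sum += $colnum
    printf ("%s:\t%.2f\t%.2f\n", $1, $colnum, sum)
}
END {
    printf ("Total:\t%.2f\n", sum)
}

Output

$ calc_sum colnum=8 datafile2.txt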
        Value   Running total
2010:   791.00  791.00
2011:   791.00  1582.00
2012:   754.00  2336.00
2013:   791.00  3127.00
2014:   791.00  3918.00
2015:   791.00  4709.00
Total:  4709.00

Of course, you can run the same program directly from the command line, but keeping it in a script file is neater and more readable.
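
For comparison, the command-line form could look roughly like the sketch below, with the same BEGIN, main and END blocks passed to awk as one quoted program. The wrapper name running_total is hypothetical, and the input is just the first three fields of the first two sample rows, fed on standard input.

```shell
# running_total is a hypothetical wrapper: running_total COLUMN [FILE...]
# With no FILE it reads standard input.
running_total() {
    col=$1
    shift
    awk -v colnum="$col" '
        BEGIN { printf "\tValue\tRunning total\n" }
        { sum += $colnum; printf "%s:\t%.2f\t%.2f\n", $1, $colnum, sum }
        END { printf "Total:\t%.2f\n", sum }
    ' "$@"
}

# Column 2 of a two-row excerpt: 191 + 144 = 335.
printf '2010 191 291\n2011 144 286\n' | running_total 2
```

The sketch uses -v, which assigns colnum before the BEGIN block runs. A colnum=8 operand, as in the script invocation, is assigned only when awk reaches it in the argument list, which is early enough here because BEGIN never reads colnum.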

