3

I have a file on a Linux machine, with one number per line:

 3073771824
 1513517589
 4982173058
 1539400944
 3175320163
 5247018359
14359014635
....

How can I multiply each number by 1.03 (i.e. add 3%) and then round the result up to the next integer, dropping the decimal part? For example:

 3073771824 * 1,03 = 3165984978,72 | I would have the result 3165984979
 1513517589 * 1,03 = 1558923116,67 | I would have the result 1558923117
....

I know that awk can round:

echo 2.5 | awk '{print int($1+0.5)}'

That prints 3, but it uses "." rather than "," as the decimal separator.

5
  • 2
    What does locale return? Commented Jul 11 at 9:33
  • I'm not sure I understand the question. The decimal separator is defined by your locale (in Italy it is a comma, in the US a dot, etc.), so you can do everything in awk using the dot: echo 2.42 | awk '{print int(($1 * 1.03) +0.5)}' prints 2, while echo 2.43 | awk '{print int(($1 * 1.03) +0.5)}' prints 3. If instead the problem is that your input data has commas instead of dots, you can use tr to translate them: echo 2,42 | tr "[,]" "[.]" | awk '{print int(($1 * 1.03) +0.5)}'
    – sigmud
    Commented Jul 11 at 9:33
  • As @ChrisDavies correctly suggested, to see your current setting you can run: locale decimal_point
    – sigmud
    Commented Jul 11 at 9:36
  • @sigmud I want to see the locale name not its value Commented Jul 11 at 9:37
  • 1
    Is your input always integer? Commented Jul 11 at 17:59

7 Answers

8

Let's assume your numbers are in a file numbers, and that your locale is one that uses , as a decimal point such as fr_FR.UTF-8. I have also added 10000000000 as one of your source numbers to demonstrate that rounding up does not occur for integer results:

export LC_ALL=fr_FR.UTF-8                # Change my locale

awk '
    {
        n=$1*1.03;                       # Multiply input value
        printf "%d\t%f\t", $1, n;        # DEBUG: show input and multiplied values
        if (n>int(n)) { n=int(n)+1 };    # Round up +ve if not an integer value
        print n;                         # Output integer result
    }
' numbers

If input numbers can be negative the rounding code would need to be more complex so that it rounded away from zero. Let me know if this might be an issue and I'll include code to address it.
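
For what it's worth, a minimal sketch of rounding away from zero might look like the following. The names round_away_103 and away() are made up for illustration, and it assumes one number per line on standard input or in the files given as arguments. It prints with %.0f because some awk implementations limit %d to 32-bit values.

```shell
# Hypothetical helper: multiply each input line by 1.03 and round away from zero.
round_away_103() {
    awk '
        # away(x): nearest integer that is at least as far from zero as x
        function away(x,   y) {
            y = int(x)                     # int() truncates toward zero
            if (x > y) return y + 1        # positive with a fraction: go up
            if (x < y) return y - 1        # negative with a fraction: go down
            return y                       # already an integer
        }
        { printf "%.0f\n", away($1 * 1.03) }
    ' "$@"
}

printf '%s\n' 50 -50 100 -100 | round_away_103
```

With the real data you would call it as round_away_103 numbers instead of piping sample values in.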

Result, including the debug columns, using your data

3073771824      3165984978.720000       3165984979
1513517589      1558923116.670000       1558923117
4982173058      5131638249.740000       5131638250
1539400944      1585582972.320000       1585582973
3175320163      3270579767.890000       3270579768
5247018359      5404428909.770000       5404428910
14359014635     14789785074.050001      14789785075
10000000000     10300000000.000000      10300000000

Result without debug, using your data

3165984979
1558923117
5131638250
1585582973
3270579768
5404428910
14789785075
10300000000

Remove or comment out the printf line to remove the two debugging columns. You can collapse the awk command onto a single line if you must, by removing the comments and joining the remaining lines.

Now, regarding the locale. The awk program expects its numeric constants to use . as a decimal point regardless of locale; this is a syntactic requirement. For example print 3,1 prints two values separated by a space (3 1) but print 3.1 prints a single decimal value (3.1).

If you are using GNU gawk you can specify the -N flag (--use-lc-numeric) to have it read and write data values, and convert strings to numbers using your locale:

echo 1000 1,3005 | gawk -N '{ print $1 * $2 }'
1300,5

echo 1000 | gawk -N '{ print "1,03" * $1 }'
1030

However, in your question you are reading and writing integer values so the locale's decimal point is not relevant.
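
If you do need comma decimals and gawk's -N is not available, one possible workaround is to swap the separator yourself. The sketch below is only an illustration: mul_comma is a made-up name, the fixed two decimal places in the output are an arbitrary assumption, and LC_ALL=C is set so that awk's own conversions reliably use the dot.

```shell
# Hypothetical helper: multiply comma-decimal values by a comma-decimal factor
# in any POSIX awk, without relying on gawk -N.
mul_comma() {
    # $1 is the factor written with a comma, e.g. "1,03"; data comes from stdin
    LC_ALL=C awk -v factor="$1" '
        BEGIN { sub(/,/, ".", factor) }          # "1,03" -> "1.03"
        {
            v = $1
            sub(/,/, ".", v)                     # "1,3005" -> "1.3005"
            r = sprintf("%.2f", v * factor)      # assumed: two decimals, dot form
            sub(/\./, ",", r)                    # back to a comma for output
            print r
        }
    '
}

echo 1000 | mul_comma 1,03
echo 1,3005 | mul_comma 1000
```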

2
7

To be clear, in echo 2.5 | awk '{print int($1+0.5)}' in your question, that awk code is not rounding up to 3, it's adding 0.5 and then rounding down to 3 (but it's already 3 so int() isn't actually rounding it). If you started with 2.4 instead of 2.5 the result would be 2, not 3 as you'd expect from rounding up, because int() always rounds down.
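
A quick way to see the difference is to run both approaches side by side. This is just a sketch: compare_rounding is a name I made up, and ceil() is the same function used in the rest of this answer. Each output line shows the input, the result of int(x+0.5), and the result of a true ceiling.

```shell
# Compare "add 0.5 then truncate" with a true ceiling for a few sample values.
compare_rounding() {
    awk '
        function ceil(x,   y) { y = int(x); return (x > y ? y + 1 : y) }
        { print $1, int($1 + 0.5), ceil($1) }
    '
}

printf '%s\n' 2.4 2.5 3 | compare_rounding
```

Only the 2.4 line should differ between the two columns, which is exactly the case the question's approach gets wrong.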

Using any awk:

$ awk '
    function ceil(x,   y) { y=int(x); return(x>y ? y+1 : y) }
    { print ceil($0 * 1.03) }
' file
3165984979
1558923117
5131638250
1585582973
3270579768
5404428910
14789785075

If your locale uses , rather than . as the decimal point, it is possible to use 1,03 instead of 1.03 in your calculation, but I wouldn't recommend it (I'd recommend using LC_ALL=C by default): it requires different code in different awk variants and is not as straightforward as it sounds. For example, with GNU awk you need to add -N to tell gawk to use your locale, and then write 1,03 as a string "1,03" instead of a numeric literal, relying on the arithmetic context (the multiplication) to convert it to a number, because a literal , means something different in an awk script:

$ LC_ALL='fr_FR' awk -N '
    function ceil(x,   y) { y=int(x); return(x>y ? y+1 : y) }
    { print ceil($0 * "1,03") }
' file
3165984979
1558923117
5131638250
1585582973
3270579768
5404428910
14789785075

See https://www.gnu.org/software/gawk/manual/gawk.html#Locale-influences-conversions for more information on that.

In general, though, what you're asking for is a ceil() (for "ceiling") function as shown above. It's important to include zero and negative numbers in your example when you're looking for any kind of rounding function, as it's easy to get them wrong. So, using this input file:

$ cat file
1.999999
1.0
0.000001
0
-0.000001
-1.0
-1.999999

we can test a ceil() function:

$ awk 'function ceil(x, y){y=int(x); return(x>y?y+1:y)} {print $0,ceil($0)}' file
1.999999 2
1.0 1
0.000001 1
0 0
-0.000001 0
-1.0 -1
-1.999999 -1

and the opposite floor() function:

$ awk 'function floor(x, y){y=int(x); return(x<y?y-1:y)} {print $0,floor($0)}' file
1.999999 1
1.0 1
0.000001 0
0 0
-0.000001 -1
-1.0 -1
-1.999999 -2

The above works because int() truncates towards zero (from the GNU awk manual):

int(x)
    Return the nearest integer to x, located between x and zero and
    truncated toward zero. For example, int(3) is 3, int(3.9) is 3,
    int(-3.9) is -3, and int(-3) is -3 as well.

so int() of a negative number already does what you want for a ceiling function, i.e. round up, and you just have to add 1 to the result if int() rounded down a positive number.

I used 0.000001, etc. in the samples to avoid people getting a false positive testing a solution that adds some number like 0.9 and then int()ing.

Also note that ceil() could be abbreviated to:

function ceil(x){return int(x)+(x>int(x))}

but I wrote it as above for clarity (it's not clear/obvious that the result of x>int(x) is 1 or 0) and efficiency (only call int() once instead of twice).
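
If you want to convince yourself that the two forms agree, something like the following sketch can be used. The names check_ceil_variants and ceil_short are mine, purely for illustration; the test values are the same ones as in the sample file above.

```shell
# Run both ceil() variants on the same values and flag any disagreement.
check_ceil_variants() {
    awk '
        function ceil(x,   y) { y = int(x); return (x > y ? y + 1 : y) }
        function ceil_short(x) { return int(x) + (x > int(x)) }
        {
            a = ceil($1); b = ceil_short($1)
            print $1, a, b, (a == b ? "same" : "DIFFERENT")
        }
    '
}

printf '%s\n' 1.999999 1.0 0.000001 0 -0.000001 -1.0 -1.999999 | check_ceil_variants
```

Every line should end in "same"; a "DIFFERENT" would mean the abbreviation changed the behaviour.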

5

Using Miller:

$ mlr --nidx put '$1 = round($1 * 1.03)' file

The round() function rounds to the nearest integer. To round up to the next integer instead, use Miller's ceiling() function:

$ mlr --nidx put '$1 = ceiling($1 * 1.03)' file
3

I'm certainly biased towards using Perl, as it's often my tool of choice to do stuff, but short of using Bash itself or simpler tools like bc, which don't handle rounding up natively (of course it could still be easily implemented, but still), Perl may have the simplest way using ceil() from the POSIX module. Because of how Perl works, the padding apparently leading the numbers will be automatically and gracefully handled:

perl -MPOSIX -lne 'print(ceil($_ * 1.03))' file
% cat file
 3073771824
 1513517589
 4982173058
 1539400944
 3175320163
 5247018359
14359014635
% perl -MPOSIX -lne 'print(ceil($_ * 1.03))' file
3165984979
1558923117
5131638250
1585582973
3270579768
5404428910
14789785075

If you wish to preserve the padding (you'll need to adjust 11 if the resulting numbers may end up being longer than 11 digits):

perl -MPOSIX -ne 'printf("%11d\n", ceil($_ * 1.03))' file
% perl -MPOSIX -ne 'printf("%11d\n", ceil($_ * 1.03))' file
 3165984979
 1558923117
 5131638250
 1585582973
 3270579768
 5404428910
14789785075
3

Using Raku (formerly known as Perl 6)

~$ raku -ne 'put ceiling($_ * 1.03)'   file

#OR

~$ raku -ne 'ceiling($_ * 1.03).put'   file

#OR

~$ raku -ne '($_ * 1.03).ceiling.put'  file

Raku can be used in a similar manner to the excellent Perl answer by @kos. Raku's ceiling routine is built-in, so no module has to be called at the command line. The put call adds a newline terminator for you, and in the first example extra parens to put are not required.

Sample Input:

 3073771824
+3073771824
-3073771824
 1513517589
 4982173058
 1539400944
 3175320163
 5247018359
14359014635

Sample Output:

3165984979
3165984979
-3165984978
1558923117
5131638250
1585582973
3270579768
5404428910
14789785075

This raises the question of what to do if a non-integer and/or non-numeric value is found in the input. Skipping that line is the easiest option, but if that causes misalignment of columnar data, a placeholder like "NaN" can be used instead. Example fixes are below (e.g. use the match operator, or coerce via +$_ or $_.Num):

~$ printf '3073771824\n' | raku -ne 'ceiling($_ * 1.03).put;'
3165984979
~$ printf '3073771824\n' | raku -ne 'ceiling($_ * 1.03).put with $_.match(/^ [\- | \+]? <[0..9]>+ $/); #regex check'
3165984979
~$ printf '3073771824\n' | raku -ne 'ceiling($_ * 1.03).put with +$_; #check numification'
3165984979
~$ printf '3073771824\n' | raku -ne 'ceiling($_ * 1.03).put with  $_.Num; #check numification'
3165984979
~$ printf '3073771824A\n' | raku -ne 'ceiling($_ * 1.03).put;'
Cannot convert string to number: trailing characters after number in '3073771824⏏A' (indicated by ⏏)
  in block <unit> at -e line 1

~$ #handle failed cases below:
~$ printf '3073771824A\n' | raku -ne 'ceiling($_ * 1.03).put with +$_; #no error (line is skipped)'
~$ printf '3073771824A\n' | raku -ne ' m/^ [\- | \+]? <[0..9]>+ $/ ?? ($_ * 1.03).ceiling.put !! "NaN".put;'
NaN
~$ printf '3073771824A\n' | raku -ne ' $_.Num.defined ?? ($_ * 1.03).ceiling.put !! "NaN".put;'
NaN
~$

https://docs.raku.org/routine/ceiling
https://raku.org

6
  • 1
    I wonder why something so basic such as ceil hasn't been made part of standard modules in Perl (or am I simply unaware of it)? Regardless, I like Raku, it's too bad it's not as widespread... And is the .put after ceiling some sort of method chaining? Implying that ceiling will return some sort of object?
    – kos
    Commented Jul 12 at 19:08
  • 1
    (I actually meant part of the language, not part of standard modules, which of course it is already...)
    – kos
    Commented Jul 12 at 19:15
  • 1
    Yes, the .put is a method chain. I will add a third answer at top to make that clearer: raku -ne '($_ * 1.03).ceiling.put'. Cheers. Commented Jul 12 at 19:26
  • I'll let a Perl expert chime in, possibly @choroba ? Commented Jul 12 at 19:44
  • 1
    I don't think they'll be pinged by the @ if they didn't participate in the conversation (maybe they'll see the comment...). However, on second thought, it kinda makes sense; the same goes for awk and other scripting languages, and it's not uncommon to import some math library to do this kind of stuff... Still, I like that Raku brings it to the table by default; I don't recall a small project where I haven't needed to round stuff up or down at some point. Thanks for the explanation about put and for outlining the third method! Maybe I'll get my feet wet with Raku one day...
    – kos
    Commented Jul 12 at 20:00
2

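You could use a shell script that loops over your numbers. Here's an example, assuming Bash and that the numbers are in a file named file (the mapfile line is only there to populate the numbers array):

mapfile -t numbers < file    # read one number per line into a Bash array

for number in "${numbers[@]}"; do
  # Keep the product if it is already an integer, otherwise truncate and add 1
  result=$(awk -v num="$number" 'BEGIN { printf "%d\n", (num * 1.03 == int(num * 1.03) ? num * 1.03 : int(num * 1.03) + 1) }')

  echo "$result"

done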
2
  • 1
    Executing num * 1.03 3 times for every input value and calling int() twice for some input values seems a bit inefficient.
    – Ed Morton
    Commented Jul 11 at 15:15
  • 1
    (1) Executing a fresh Awk process for each data value also seems a bit inefficient.  (2) Nit-pick: you say you are providing “input” to the command, but, in fact, you are providing a parameter to the command. Commented Jul 12 at 20:48
0

It's been a long, long time since I used awk, and it's been even longer since I learned about integer math.

You'll have to check if your numbers don't exceed limits, but...

x = ((x * 103) + 99) / 100

will give you an integer value that is 103% of integer x rounded up.

If you may have negative values and need to round away from zero, add some logic to subtract 99 for negative values instead of adding 99 (adding is only correct for positive values).

Simple examples:

103% of 50:
 50 * 103 = 5150  ||  5150 + 99 = 5249  ||  5249 / 100 = 52 (integer div aka truncation)

103% of 100:
100 * 103 = 10300  ||  10300 + 99 = 10399  ||  10399 / 100 = 103 (no surprise)

103% of 101:
101 * 103 = 10403  ||  10403 + 99 = 10502  ||  10502 / 100 = 105 (104.03 ==> 105)

Try it yourself with some numbers.

Also, look into utilities dc and bc, two simple calculator programs. For simple data transformation such as this, these might be even easier than tackling the richness of awk.
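
As a sketch of the integer-only approach, shell arithmetic already does truncating integer division, so the formula can be used as written there. (In awk itself, / is floating-point division, so you would need to wrap the result in int().) The function name ceil_103_percent is made up for illustration, and this assumes non-negative input without leading zeros and a shell with 64-bit arithmetic.

```shell
# Hypothetical helper: 103% of an integer, rounded up, using integer-only arithmetic.
# Assumes non-negative input and 64-bit shell arithmetic.
ceil_103_percent() {
    echo $(( ($1 * 103 + 99) / 100 ))
}

ceil_103_percent 50
ceil_103_percent 100
ceil_103_percent 101
ceil_103_percent 14359014635
```

To process a whole file you could wrap it in a loop such as: while read -r n; do ceil_103_percent "$n"; done < file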
