[R] number of decimal places in a number?
(Ted Harding)
Ted.Harding at wlandres.net
Sat Jul 7 14:12:34 CEST 2012
I had thought of also (as well as my numerical routine) suggesting
a "gsub()" type solution like Joshua's below, but held back because
the result could depend on how the number arose (keyboard input,
file input, or from computation within R).
However, I now also realise that (again because of binary rounding
errors), the "gsub()" method has interesting differences from my
numerical method. Example:
[A] (as from my original method):
f(123456789.123456789)
# [1] 7
[B] (the "gsub()" method)
nchar(gsub("(.*\\.)|([0]*$)", "", as.character(123456789.123456789)))
# [1] 6
Now look at:
[C] (what as.character() does to 123456789.123456789)
as.character(123456789.123456789)
# [1] "123456789.123457"
[D] ("22" is the maximum number of significant digits for print())
print(123456789.123456789,22)
# [1] 123456789.1234568
So as.character() has rounded it to 6 decimal places (agreeing
with [B]), while using print() with the maximum of 22 digits
(more than enough for the 18 digits in 123456789.123456789)
rounds it to 7 decimal places (i.e. 16 digits in all), which
is about the limit (depending on the magnitude of the number)
that R can hold internally; this agrees with [A].
Note the difference between
[D] ("22" is the maximum number of significant digits for print())
print(123456789.123456789,22)
# [1] 123456789.1234568
[E] (similar, but with a different magnitude)
print(923456789.123456789,22)
# [1] 923456789.123457
(compare with [C]).
So, clearly, there is potential uncertainty in the output
from either method, but perhaps there is somewhat more
uncertainty with the "gsub()" method.
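To see where the two answers come from, it can help to look at the
stored value directly. The following is only a sketch and is not part
of either method: sprintf() with "%.17g" shows 17 significant digits
(enough to tell any two doubles apart), while as.character() keeps at
most 15 significant digits, as documented in ?as.character. The names
s17 and s15 are introduced here purely for illustration.

```r
x <- 123456789.123456789

# 17 significant digits: enough to tell any two doubles apart
s17 <- sprintf("%.17g", x)

# as.character() keeps at most 15 significant digits
s15 <- as.character(x)

# Digits after the decimal point in each string
nchar(sub("^.*\\.", "", s17))
nchar(sub("^.*\\.", "", s15))

# The 15-digit string no longer identifies the stored value
identical(as.numeric(s15), x)
# [1] FALSE
```

So the count obtained from a string depends on how many significant
digits the conversion to character was allowed to keep.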
Also, another nasty little trap with "gsub()":
[F] (my method)
f(0.0000012345)
# [1] 10
[G] ("gsub()" method)
nchar(gsub("(.*\\.)|([0]*$)", "", as.character(0.0000012345)))
# [1] 8
which arises because:
as.character(0.0000012345)
# [1] "1.2345e-06"
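One way to sidestep the exponent form, offered only as a sketch, is to
force fixed notation with format(..., scientific = FALSE) before
counting. The helper name count_decimals() and the choice of
digits = 15 are assumptions made for illustration; they are not part
of Joshua's method, and the result is still subject to the binary
rounding issues described above.

```r
# Hypothetical helper: count decimal places via a fixed-notation string
count_decimals <- function(x, digits = 15) {
  vapply(x, function(xi) {
    # Format one value at a time, so that format() does not pad all
    # elements to a common number of decimals
    s <- format(xi, scientific = FALSE, digits = digits)
    # No decimal point at all: treat as zero decimal places
    if (!grepl(".", s, fixed = TRUE)) return(0L)
    # Keep the part after the point, then drop trailing zeros
    frac <- sub("0+$", "", sub("^.*\\.", "", s))
    nchar(frac)
  }, integer(1))
}

count_decimals(0.0000012345)
# [1] 10
count_decimals(c(3.14, 3.1400, 100))
# [1] 2 2 0
```

The explicit check for a missing decimal point matters: applied to
the string "100", the original expression strips the trailing zeros
and reports 1 rather than 0.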
There would seem to be no clean general solution to this
question. An important issue would be: What use do you
want to put the result to?
If there is something in the logic of your application
which depends critically on the numbers of decimal places
in its numerical input, then the final result could be
completely wrong because of these uncertainties.
In such a case, it might be best to force the initial input
to be read as character strings. For example, if reading numerical
data into a dataframe from (say) a CSV file, then the option
Data <- read.csv("datafile.csv", colClasses="character")
(or similar) would read every column as character, exactly as
written in the file. Joshua's "gsub()" method would then give
the right result when applied to these strings, provided they
are written in plain decimal notation with a decimal point (not
in exponent form such as 1.2345e-06). Having got that out of the
way, you can convert the character strings into numeric (to
within the precision that R will allow).
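A minimal sketch of that workflow follows. The CSV content, the
column names "id" and "value", and the variable names are invented
for illustration; read.csv(text = ...) is used only so that the
example runs without a file on disk.

```r
# Made-up CSV content standing in for "datafile.csv"
csv_text <- "id,value
1,3.14
2,3.1400
3,123456789.123456789
4,0.0000012345
"

# Read every column as character, exactly as written
Data <- read.csv(text = csv_text, colClasses = "character")

# Joshua's expression, applied to the strings rather than to doubles
decimals <- nchar(gsub("(.*\\.)|([0]*$)", "", Data$value))
decimals
# [1]  2  2  9 10

# Only now convert to numeric (to within the precision R allows)
Data$value_num <- as.numeric(Data$value)
```

Here 123456789.123456789 gives 9 and 0.0000012345 gives 10, because
the digits are counted before any conversion to binary takes place.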
However, if something in the logic depends critically on the
numbers of "decimal places" in numbers computed internally by R,
then I think the case is hopeless!
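As a small illustration of the computed case (a sketch only; f() is
repeated from the earlier message so that the snippet runs on its
own):

```r
# The numerical method, as defined earlier in the thread
f <- function(x) {min(which( x*10^(0:20)==floor(x*10^(0:20)) )) - 1}

x <- 0.1 + 0.2      # arithmetically 0.3, i.e. one decimal place

x == 0.3
# [1] FALSE

as.character(x)
# [1] "0.3"

# The string method sees one decimal place ...
nchar(gsub("(.*\\.)|([0]*$)", "", as.character(x)))
# [1] 1

# ... while the numerical method sees the rounding error and
# reports more than one
f(x)
```

Neither answer is wrong about what it was given; the two methods are
simply looking at different renderings of the same stored value.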
Ted.
On 07-Jul-2012 10:44:55 Joshua Wiley wrote:
> Hi Martin,
>
> Ted is spot on about the binary representation. A very different
> approach from his would be to convert to character and use regular
> expressions:
>
> ## the example numbers in a vector
> x <- c(3.14, 3.142, 3.1400, 123456.123456789, 123456789.123456789, pi,
> sqrt(2))
>
> nchar(gsub("(.*\\.)|([0]*$)", "", as.character(x)))
>
> which for me returns:
> [1] 2 3 2 9 6 14 13
>
> an advantage of this approach is that for numbers like
> 123456789.123456789, although R cannot represent it properly as a
> binary number, the character string is totally fine.
>
> nchar(gsub("(.*\\.)|([0]*$)", "", "123456789.123456789"))
>
> returns 9
>
> Essentially the expression looks for anything (the period) zero or
> more times (the *) followed by an actual period (the \\.) OR 0
> repeated zero or more times at the end of the string, and replaces all
> of those with nothing (the "") and then returns the result, the number
> of characters of which is counted by nchar()
>
> See ?regex for details
>
> Cheers,
>
> Josh
>
> On Sat, Jul 7, 2012 at 3:04 AM, Ted Harding <Ted.Harding at wlandres.net> wrote:
>> On 07-Jul-2012 08:52:35 Martin Ivanov wrote:
>>> Dear R users,
>>>
>>> I need a function that gets a number and returns its number of
>>> actual decimal places.
>>> For example f(3.14) should return 2, f(3.142) should return 3,
>>> f(3.1400) should also return 2 and so on. Is such function already
>>> available in R? If not, could you give me a hint how to achieve that?
>>>
>>> Many thanks in advance.
>>
>> I'm not aware of such a function in R. In any case, it will be
>> a tricky question to solve in full generality, since R stores
>> numbers internally in a binary representation and the exact
>> conversion of this representation to a decimal number may not
>> match the exact value of the decimal representation of the
>> original number.
>>
>> In particular, a number entered as a decimal representation
>> from the keyboard, or read as such from a text file, may not
>> be exactly matched by the internal representation in R.
>>
>> However, that said, the following function definition seems to
>> do what you are asking for, for cases such as you list:
>>
>> f<-function(x) {min(which( x*10^(0:20)==floor(x*10^(0:20)) )) - 1}
>>
>> f(3.14)
>> # [1] 2
>> f(3.142)
>> # [1] 3
>> f(3.1400)
>> # [1] 2
>>
>>
>>
>> Note, however:
>>
>> f(123456.123456789)
>> # [1] 9
>>
>> f(123456789.123456789)
>> # [1] 7
>>
>> (a consequence of the fact that R does not have enough binary
>> digits in its binary representation to accommodate the precision
>> in all the decimal digits of 123456789.123456789 -- not that it
>> can do that exactly anyway in binary, no matter how many binary
>> digits it had available).
>>
>> Similarly:
>>
>> f(pi)
>> # [1] 15
>> f(sqrt(2))
>> # [1] 16
>>
>> which is a consequence of the fact that 2 < pi < 4, while
>> 1 < sqrt(2) < 2, so the binary representation of pi needs
>> 1 more binary digit for its integer part than sqrt(2) does,
>> which it therefore has to "steal" from the fractional part.
>>
>> Hoping this helps,
>> Ted.
>>
>> -------------------------------------------------
>> E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
>> Date: 07-Jul-2012 Time: 11:04:26
>> This message was sent by XFMail
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> Programmer Analyst II, Statistical Consulting Group
> University of California, Los Angeles
> https://joshuawiley.com/
-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 07-Jul-2012 Time: 13:12:31
This message was sent by XFMail