(No, the subject isn't about odd and even numbers. )
A part of this post is complaining and a part is asking for your experiences and advice.
I'm currently reading the chapter on numbers in the Racket Reference. I read parts of it some time ago, when I started with Racket, but maybe I didn't read as thoroughly back then.
Some things I came across and find surprising:
integer? on a "seemingly float" value returns #t, e.g. (integer? 2.0). This is only mildly surprising though; I can live with this. But I found it interesting to realize how much "we" think in machine data types when thinking about data types.
(inexact->exact 1.2) doesn't truncate or round, but converts to a rational number (i.e. with numerator and denominator). Fair enough, but still a bit surprising when coming from other languages.
However, what I found strange and might have missed before, is how exactness can influence a result:
(real? 1.0+0i) gives #t, but (real? 1.0+0.0i) gives #f.
(/ 1 0) and (/ 1.0 0) raise an exception (as I had expected), but (/ 1 0.0) and (/ 1.0 0.0) return +inf.0.
(/ 0 0) and (/ 0.0 0) raise an exception, but (/ 0 0.0) and (/ 0.0 0.0) return +nan.0.
So integer? doesn't care about exactness, but real? on complex numbers does, and / cares about exactness only of the second argument.
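For concreteness, the observations above can be collected into one runnable sketch (the expected values in the comments are the ones reported above):

```racket
#lang racket

;; integer? ignores exactness: the flonum 2.0 represents the integer 2.
(integer? 2.0)      ; #t

;; real? on complex numbers does look at the exactness of the imaginary part.
(real? 1.0+0i)      ; #t -- exact zero imaginary part: certainly on the real axis
(real? 1.0+0.0i)    ; #f -- inexact zero imaginary part

;; / only cares about the exactness of the divisor:
;; (/ 1 0) and (/ 1.0 0) raise exn:fail:contract:divide-by-zero.
(/ 1 0.0)           ; +inf.0
(/ 1.0 0.0)         ; +inf.0
(/ 0 0.0)           ; +nan.0
(/ 0.0 0.0)         ; +nan.0
```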
Now a few questions:
Have any of the above behaviors of Racket gotten you into trouble, e.g. by making programs harder to reason about or by causing bugs?
Are there other Racket number "type"-related pitfalls that might cause problems ("type" in either the machine-type sense or the more abstract/mathematical sense)?
Are there best practices you recommend when dealing with numeric computations (integer/float/complex/mixed, exact/inexact/mixed)?
(As far as I'm concerned, we don't need to discuss float rounding errors; I'm more interested in Scheme- or Racket-specific problems.)
It is important to distinguish between the mathematical numbers, the Scheme numbers that attempt to model them, the machine representations used to implement the Scheme numbers, and notations used to write numbers.
There is no simple relationship between a number's type and its representation inside a computer.
Scheme's numerical operations treat numbers as abstract data, as independent of their representation as possible.
Scheme numbers are either exact or inexact. A number is exact if it was written as an exact constant or was derived from exact numbers using only exact operations.
A Scheme number thus consists of something that represents a mathematical number and a flag to store the (in)exactness.
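That flag can be inspected directly with exact? and inexact? (a small sketch; note also how inexactness is contagious through arithmetic):

```racket
#lang racket

(exact? 2)       ; #t
(exact? 2.0)     ; #f
(inexact? 2.0)   ; #t
(exact? 1/3)     ; #t -- exact rationals keep the flag set

;; Inexactness is contagious: one inexact operand makes the result inexact.
(+ 1 2)          ; 3
(+ 1 2.0)        ; 3.0
```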
Given this perspective:
The expression (integer? 2.0) is true, since 2.0 represents the mathematical number 2.
The expression (inexact->exact 1.2) evaluates to a rational number, since a fraction can represent the same mathematical number as the flonum 1.2 with the (in)exactness flag flipped. (Strictly speaking, the result is not 12/10 but the exact value of the flonum closest to 1.2, a dyadic rational with denominator 2^52; more on this below.)
In (real? 1.0+0i) the imaginary part is exactly zero, so 1.0+0i lies on the real axis with certainty.
In (real? 1.0+0.0i) the imaginary part is 0.0, an inexact zero. So 1.0+0.0i might represent a mathematical number close to, but not on, the real axis.
Now the question was: why is (integer? 2.0) true and (real? 1.0+0.0i) false?
If the same reasoning were used for both functions, then (integer? 2.0) ought to give false, since 2.0 might represent a mathematical number merely close to 2.
I think the choice of making (integer? 2.0) true is due to:
Scheme's numerical operations treat numbers as abstract data, as independent of their representation as possible. Although an implementation of Scheme may use fixnum, flonum, and perhaps other representations for numbers, this should not be apparent to a casual programmer writing simple programs.
A naïve way of testing whether a number x is a square number: (integer? (sqrt x)).
This works with the current choice.
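For example (keeping in mind the report's caveat, quoted below, that inaccuracy can make such predicates unreliable for computed values):

```racket
#lang racket

(define (square-number? x)
  (integer? (sqrt x)))

(square-number? 16)    ; #t -- sqrt of an exact perfect square is even exact here
(square-number? 16.0)  ; #t -- sqrt returns the flonum 4.0, which integer? accepts
(square-number? 15)    ; #f
```

If (integer? 2.0) were false, any case where sqrt falls back to flonums would break this test.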
However the Scheme report has a few important notes on the subject:
Note: The behavior of these type predicates on inexact numbers is unreliable, since any inaccuracy may affect the result.
Note: In many implementations the rational? procedure will be the same as real?, and the complex? procedure will be the same as number?, but unusual implementations may be able to represent some irrational numbers exactly or may extend the number system to support some kind of non-complex numbers.
The examples with division:
In (/ 0 0) we are dividing by an exact zero; division by the mathematical number zero is undefined, so it deserves an exception.
In (/ 0 0.0) we are dividing by an inexact zero. The inexact zero might stem from a longer computation whose result got rounded to zero. So conceptually we are not dividing by exact zero, just by a very small number (whose exact value we don't know).
The last example also shows that Racket follows the IEEE 754 standard when computing with floating-point numbers.
It might be worth mentioning that number representations are complicated in all languages.
The representation (string) "1.2" is converted internally to the closest dyadic rational number (number with denominator a power of 2) with the "inexact flag" as you say; converting back to exact reveals that dyadic rational number.
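You can watch this happen at the REPL. (The digits below are the exact value of the IEEE double nearest to 1.2; the round trip back to inexact is lossless.)

```racket
#lang racket

(inexact->exact 1.2)
; 5404319552844595/4503599627370496 -- the denominator is 2^52

(exact->inexact (inexact->exact 1.2))  ; 1.2 -- round trip recovers the flonum

(= (denominator (inexact->exact 1.2)) (expt 2 52))  ; #t
```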
You can use exact-integer? to check if a value is an integer and stored as an integer.
You can use exact-truncate, exact-floor or exact-ceiling to obtain an integer from a fractional number; which one you use depends on how you want to handle the fractional part.
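These live in racket/math; a quick sketch of the differences:

```racket
#lang racket
(require racket/math)  ; exact-floor, exact-ceiling, exact-truncate

(exact-floor 2.7)      ; 2
(exact-ceiling 2.3)    ; 3
(exact-truncate -2.7)  ; -2 -- truncation goes toward zero
(exact-floor -2.7)     ; -3 -- floor goes toward negative infinity
```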
The behavior is similar in C++: integer division by zero is undefined behavior (on Linux it typically results in a SIGFPE signal), while floating-point division by zero produces "inf" or "nan", though you can control that at the application level.
Python is also inconsistent: it raises an exception if you try to divide either integers or floating-point numbers by 0; however, NumPy division produces "inf" if you divide by zero (although it prints a warning, which you can disable).
Thanks for your feedback. I now understand a few things better.
Since NumPy is a third-party library, I'd say that "Python" is more consistent "with itself" than Racket. On the other hand, Racket is more consistent with other languages that use flonums and return inf or nan for division by zero.
What hasn't been picked up yet are my questions about recommended practices (from the end of my initial post).
From what I've read so far, I would now say that it's advisable to normalize input data to the types they're supposed to have - as soon as the data is received. Depending on the context, instead you may want to raise an error if a value doesn't have the expected exactness. This can be complemented by contracts to make sure that, for example, a procedure that is supposed to take a float value actually gets a float value.
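As a sketch of that practice (the helper name field->flonum is made up for illustration; define/contract and flonum? are standard):

```racket
#lang racket

;; Normalize a raw input field to a flonum at the boundary, or raise.
(define (field->flonum v)
  (cond [(flonum? v) v]
        [(and (real? v) (exact? v)) (exact->inexact v)]
        [else (error 'field->flonum "cannot normalize to flonum: ~e" v)]))

;; A contract then guarantees the procedure only ever sees flonums.
(define/contract (mean xs)
  (-> (non-empty-listof flonum?) flonum?)
  (/ (for/sum ([x (in-list xs)]) x)
     (exact->inexact (length xs))))

(mean (map field->flonum '(1 2.3 4)))  ; a flonum, on one code path
```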
Rationale: For example, if I have these two rows/records of input values,
1 2.3 4
5 6 7
and I put each record through the same algorithm without prior conversion, I might get 6.468137108187422 for the first row and 36141965775283752674541165588781666103846889/25473495682521985696548866286700877509677673 for the second row. The two rows will also use different handling of division by zero (and may differ in other behavior). Actually, for the first row, different parts of the calculation might use different division-by-zero handling. Generally, I'd want to minimize the number of possible (including internal) code paths and behaviors to make the code easier to reason about and less bug-prone.
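To make this concrete with the two rows above (the three-field algo is a stand-in for the real algorithm):

```racket
#lang racket

(define (algo row)   ; stand-in: divide the fields pairwise
  (/ (first row) (second row) (third row)))

(algo '(1 2.3 4))    ; a flonum -- the 2.3 made everything inexact
(algo '(5 6 7))      ; 5/42 -- an exact rational, a different code path

;; Normalizing first keeps every row on the flonum path:
(define (normalize row) (map exact->inexact row))
(algo (normalize '(5 6 7)))  ; a flonum (~0.119)
```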