Safe Conversion from Numeric to Integer

Validating numbers before converting them to integers

To save memory, you may sometimes want to convert a numeric column to an integer column, provided it is safe to do so.

Consider this small data set, total_cacs_dataset, which holds patients' total coronary calcium Agatston scores.

Dangers of Unsafe Integer Conversion

If you look at the class (data type) of the columns CACS Numeric Safe 1, CACS Numeric Safe 2 and CACS Numeric Unsafe, you will see that they are all numeric.
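A small stand-in for such a data set makes this concrete. The sketch below assumes a base data.frame with the three column names used in the text. All values are illustrative assumptions, except 746.9, 0.5 and 95.9, which are the unsafe values discussed in this section.

```r
# Hypothetical stand-in for total_cacs_dataset (values are illustrative)
total_cacs_dataset <- data.frame(
  `CACS Numeric Safe 1` = c(0, 12, 305),
  `CACS Numeric Safe 2` = c(1, 40, 1000),
  `CACS Numeric Unsafe` = c(746.9, 0.5, 95.9),
  check.names = FALSE  # keep the spaces in the column names
)

# Every column is stored as double-precision "numeric"
column_classes <- sapply(total_cacs_dataset, class)
print(column_classes)
```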

We want an automatic way to check whether these column values can safely be converted to type integer with as.integer(), because as.integer() silently drops the fractional part and an unsafe conversion therefore loses precision. Observe that 746.9, 0.5 and 95.9 are converted to 746, 0 and 95.
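The truncation can be reproduced with base R alone. This minimal sketch uses only the three values quoted above; the variable names are chosen for illustration.

```r
# Values quoted in the text above
cacs_unsafe <- c(746.9, 0.5, 95.9)

# as.integer() drops the fractional part without any warning
converted <- as.integer(cacs_unsafe)
print(converted)
# [1] 746   0  95

# The conversion is safe only if the converted values still equal the originals
print(all(converted == cacs_unsafe))
# [1] FALSE
```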

Check if a numeric column is safe for integer conversion

While pointblank does not have a built-in function for this, we can create a customised validation that uses a predicate expression (an R expression that returns a vector of TRUE, FALSE or NA values) and then use the function pointblank::col_vals_expr to identify which rows do not meet the predicate expression.

The predicate expression can be built with the function checkmate::testIntegerish, which returns FALSE if the input vector cannot be safely converted to integer. Because it returns a single value for the whole input, we apply it element by element with purrr::map_lgl to get a TRUE or FALSE output for each element. Here is an example that uses the column CACS Numeric Unsafe as input and returns a logical vector with one value per row.
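A minimal sketch of the predicate, assuming the checkmate and purrr packages are installed. The vector stands in for the CACS Numeric Unsafe column, and the values 0 and 12 are illustrative additions.

```r
# Stand-in for the CACS Numeric Unsafe column (0 and 12 are illustrative)
cacs_unsafe <- c(0, 746.9, 0.5, 95.9, 12)

# Called on the whole vector, testIntegerish() gives a single answer
whole_vector <- checkmate::testIntegerish(cacs_unsafe)
print(whole_vector)
# [1] FALSE

# Mapped over the elements, it gives one logical value per row
per_element <- purrr::map_lgl(cacs_unsafe, checkmate::testIntegerish)
print(per_element)
# [1]  TRUE FALSE FALSE FALSE  TRUE
```

The per-element form is what a row-wise validation needs, since each row must receive its own pass or fail result.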

We then use pointblank::col_vals_expr with the predicate expression to isolate the rows in the column CACS Numeric Unsafe that cannot be safely converted to integer.
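One possible way to wire the predicate into a validation is sketched below. It assumes pointblank, checkmate and purrr are installed, passes the predicate as a one-sided formula, and uses a hypothetical data frame in which only the column name comes from the text.

```r
# Hypothetical data; only the column name comes from the text
cacs_data <- data.frame(
  `CACS Numeric Unsafe` = c(0, 746.9, 0.5, 95.9, 12),
  check.names = FALSE
)

# Build the validation plan with one expression-based step, then run it
agent <- pointblank::create_agent(tbl = cacs_data)
agent <- pointblank::col_vals_expr(
  agent,
  expr = ~ purrr::map_lgl(`CACS Numeric Unsafe`, checkmate::testIntegerish)
)
agent <- pointblank::interrogate(agent)

# FALSE here means at least one row failed the predicate
print(pointblank::all_passed(agent))

# Extract the rows that failed the validation step
failed_rows <- pointblank::get_sundered_data(agent, type = "fail")
print(failed_rows)
```

With this data, the failing rows should be the ones holding 746.9, 0.5 and 95.9.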

Here is an example (using the columns CACS Numeric Safe 1 and CACS Numeric Safe 2) in which the validation reports no failures, so the conversion can be done safely.
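A sketch of the passing case follows, under the same assumptions as before: pointblank, checkmate and purrr installed, and illustrative values in a hypothetical data frame. The conversion runs only when every validation step passes.

```r
# Hypothetical data with whole-number values only
cacs_data <- data.frame(
  `CACS Numeric Safe 1` = c(0, 12, 305),
  `CACS Numeric Safe 2` = c(1, 40, 1000),
  check.names = FALSE
)

# One expression-based step per column
agent <- pointblank::create_agent(tbl = cacs_data)
agent <- pointblank::col_vals_expr(
  agent,
  expr = ~ purrr::map_lgl(`CACS Numeric Safe 1`, checkmate::testIntegerish)
)
agent <- pointblank::col_vals_expr(
  agent,
  expr = ~ purrr::map_lgl(`CACS Numeric Safe 2`, checkmate::testIntegerish)
)
agent <- pointblank::interrogate(agent)

# Convert only when all rows passed in every step
if (pointblank::all_passed(agent)) {
  safe_cols <- c("CACS Numeric Safe 1", "CACS Numeric Safe 2")
  cacs_data[safe_cols] <- lapply(cacs_data[safe_cols], as.integer)
}

print(sapply(cacs_data, class))
```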

What about NA values?

If you also need to catch rows with NA values, setting the argument na_pass = FALSE in pointblank::col_vals_expr will not work.

This is because checkmate::testIntegerish will return TRUE if the input is NA. Thus the predicate expression will never return an NA value.
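This behaviour is easy to confirm directly. The sketch below assumes checkmate and purrr are installed and uses an illustrative three-element vector.

```r
# A lone missing value is accepted by the default settings
na_result <- checkmate::testIntegerish(NA_real_)
print(na_result)
# [1] TRUE

# So the per-element predicate never yields NA, even for a missing value
per_element <- purrr::map_lgl(c(10, NA, 2.5), checkmate::testIntegerish)
print(per_element)
# [1]  TRUE  TRUE FALSE

print(anyNA(per_element))
# [1] FALSE
```

Since na_pass only affects rows where the predicate itself evaluates to NA, it has nothing to act on here.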

One way to overcome this is to add another validation step using the function pointblank::col_vals_not_null.
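A sketch of the two-step validation, assuming pointblank, checkmate and purrr are installed. The data frame is hypothetical, and passing the column name as a character string to the columns argument is an assumption made here to cope with the spaces in the name.

```r
# Hypothetical data with one missing and one non-integer value
cacs_data <- data.frame(
  `CACS Numeric Unsafe` = c(10, NA, 2.5),
  check.names = FALSE
)

agent <- pointblank::create_agent(tbl = cacs_data)

# Step 1: flags 2.5, but lets the NA through
agent <- pointblank::col_vals_expr(
  agent,
  expr = ~ purrr::map_lgl(`CACS Numeric Unsafe`, checkmate::testIntegerish)
)

# Step 2: flags the NA that step 1 cannot see
agent <- pointblank::col_vals_not_null(
  agent,
  columns = "CACS Numeric Unsafe"
)
agent <- pointblank::interrogate(agent)

# Rows failing at least one of the two steps
failed_rows <- pointblank::get_sundered_data(agent, type = "fail")
print(failed_rows)
```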