Lecture 02
The fundamental building blocks of data in R are vectors (collections of related values, objects, data structures, etc.).
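R has two types of vectors:
- atomic vectors (vectors): collections of values that all share the same type (e.g. all true/false values, all numbers, or all character strings).
- generic vectors (lists): collections that can hold R objects of any type.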
R has six atomic vector types. We can check the type of any object in R using the typeof() function.
typeof() | mode() |
---|---|
logical | logical |
double | numeric |
integer | numeric |
character | character |
complex | complex |
raw | raw |
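As a quick check of the table above, the sketch below calls typeof() and mode() on one value of each atomic type. The names examples, types, and modes are illustrative only.

```r
# One illustrative value for each of the six atomic types
examples <- list(TRUE, 1, 1L, "a", 1+2i, as.raw(1))

# Apply typeof() and mode() to every value
types <- sapply(examples, typeof)
modes <- sapply(examples, mode)

# Side-by-side comparison, mirroring the table above
print(data.frame(typeof = types, mode = modes))
```

Note that a bare number such as 1 is a double; the L suffix (1L) is needed to get an integer.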
- logical: boolean values (TRUE and FALSE)
- character: text strings. Either single or double quotes are fine, but the opening and closing quotes must match.
- double: floating point values (these are the default numerical type)
Atomic vectors can be grown (combined) using the combine function, c().
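A minimal sketch of growing a vector with c(); the variable names are illustrative.

```r
x <- c(1, 2, 3)
y <- c(x, 4, 5)        # append two more values to x
z <- c(y, c(6, 7))     # nested c() calls are flattened into one vector

print(z)
print(length(z))
```

The result is always a single flat atomic vector, never a vector of vectors.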
- typeof(x): returns a character vector (length 1) of the type of object x.
- mode(x): returns a character vector (length 1) of the mode of object x.
- is.logical(x): returns TRUE if x has type logical.
- is.character(x): returns TRUE if x has type character.
- is.double(x): returns TRUE if x has type double.
- is.integer(x): returns TRUE if x has type integer.
- is.numeric(x): returns TRUE if x has mode numeric.
- is.atomic(x): returns TRUE if x is an atomic vector.
- is.list(x): returns TRUE if x is a list (generic vector).
- is.vector(x): returns TRUE if x is either an atomic or generic vector.

R is a dynamically typed language: it will automatically convert between most types without raising warnings or errors. Keep in mind that atomic vectors must always contain values of the same type.
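The following sketch applies several of the testing functions to an integer vector and a list, then shows one case of silent conversion when values of different types are combined. The objects x, l, and mixed are illustrative.

```r
x <- c(1L, 2L, 3L)          # an integer vector
l <- list(1L, "a", TRUE)    # a list can mix types

int_checks  <- c(is.integer(x), is.double(x), is.numeric(x))  # TRUE FALSE TRUE
vec_checks  <- c(is.atomic(x), is.list(x), is.vector(x))      # TRUE FALSE TRUE
list_checks <- c(is.atomic(l), is.list(l), is.vector(l))      # FALSE TRUE TRUE

# Mixing types in c() silently converts everything to a single type
mixed <- c(1.5, "a")
print(typeof(mixed))   # "character"
```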
Built-in operators and functions (e.g. +, &, log(), etc.) will generally attempt to coerce values to an appropriate type for the given operation.
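A few illustrative cases of operator coercion; the variable names are placeholders.

```r
a <- TRUE + 1      # logical coerced to double: 2
b <- 1.5 + 1L      # integer coerced to double: 2.5
d <- log(TRUE)     # TRUE treated as 1, so the result is 0
e <- TRUE & 7      # a nonzero number is treated as TRUE
f <- FALSE | !5    # !5 is FALSE, so the result is FALSE

print(c(typeof(a), typeof(b), typeof(d), typeof(e), typeof(f)))
```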
Most of the is.*() functions we just saw have a corresponding as.*() variant, which can be used for explicit coercion.
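A short sketch of explicit coercion. The inputs are arbitrary examples, and suppressWarnings() is used only to keep the output quiet for the case that cannot be converted.

```r
lgl <- as.logical(5.2)      # TRUE: any nonzero number becomes TRUE
chr <- as.character(TRUE)   # "TRUE"
int <- as.integer(pi)       # 3: the fractional part is dropped
num <- as.numeric("3.5")    # 3.5

# Text that does not look like a number becomes NA (with a warning)
bad <- suppressWarnings(as.numeric("abc"))

print(list(lgl, chr, int, num, bad))
```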
R uses NA to represent missing values in its data structures. What may not be obvious is that there are different NAs for different atomic types.
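The typed missing values can be inspected directly. This sketch uses R's built-in constants NA_integer_, NA_real_, and NA_character_; na_types is an illustrative name.

```r
na_types <- c(
  typeof(NA),             # the default NA is logical
  typeof(NA_integer_),
  typeof(NA_real_),
  typeof(NA_character_)
)
print(na_types)

# NA takes on the type of the vector it is combined into
print(typeof(c(NA, 1.5)))   # "double"
```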
Because NAs represent missing values, it makes sense that any calculation using them should also be missing.
Summarizing functions (e.g. sum(), mean(), sd(), etc.) will often have an na.rm argument which will allow you to drop missing values.
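A small sketch of how a missing value propagates through a calculation and how na.rm changes the result; x is an illustrative vector.

```r
x <- c(1, 2, NA, 4)

total      <- sum(x)                 # NA: one unknown value makes the sum unknown
total_rm   <- sum(x, na.rm = TRUE)   # 7: the NA is dropped first
average_rm <- mean(x, na.rm = TRUE)  # mean of the three known values

print(c(total, total_rm, average_rm))
```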
A useful mental model for NAs is to consider them as an unknown value that could take any of the possible values for a type.
For numbers or characters this isn’t very helpful, but for a logical value we know that the value must either be TRUE or FALSE and we can use that when deciding what value to return.
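The mental model can be checked with the logical operators: when the result is the same whether the unknown value is TRUE or FALSE, R returns that result rather than NA. The variable names are illustrative.

```r
or_true   <- TRUE | NA    # TRUE: true whichever value NA stands for
and_false <- FALSE & NA   # FALSE: false whichever value NA stands for
and_true  <- TRUE & NA    # NA: the answer depends on the unknown value
or_false  <- FALSE | NA   # NA: the answer depends on the unknown value

print(c(or_true, and_false, and_true, or_false))
```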
The following special values are defined as part of the IEEE floating point standard (not unique to R):
- NaN: Not a number
- Inf: Positive infinity
- -Inf: Negative infinity
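These values arise naturally from ordinary arithmetic, as the sketch below illustrates; the variable names are placeholders.

```r
pos_inf <- 1 / 0       # Inf
neg_inf <- -1 / 0      # -Inf
not_num <- 0 / 0       # NaN
also_nan <- Inf - Inf  # NaN

print(c(pos_inf, neg_inf, not_num, also_nan))
print(typeof(not_num))  # "double"
```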
Inf and NaN
Inf can be tested with the usual comparison operators (e.g. Inf == Inf is TRUE), whereas comparisons involving NaN return NA, as they do for NAs. There are convenience functions for testing for both types of values.
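A sketch of the base R testing functions is.finite(), is.infinite(), is.nan(), and is.na() applied to one illustrative vector.

```r
x <- c(1, NA, NaN, Inf, -Inf)

finite   <- is.finite(x)    # TRUE only for ordinary numbers
infinite <- is.infinite(x)  # TRUE for Inf and -Inf
nan      <- is.nan(x)       # TRUE only for NaN
na       <- is.na(x)        # TRUE for both NA and NaN

print(rbind(finite, infinite, nan, na))
```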
First, remember that Inf, -Inf, and NaN are doubles; however, their coercion behavior is not the same as for other doubles.
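A brief sketch of explicit coercion applied to these values. suppressWarnings() is used because converting Inf to integer raises a warning; the variable names are illustrative.

```r
# Integers have no representation for Inf or NaN, so both become NA
int_inf <- suppressWarnings(as.integer(Inf))
int_nan <- suppressWarnings(as.integer(NaN))

lgl_inf <- as.logical(Inf)   # TRUE, like any nonzero number
lgl_nan <- as.logical(NaN)   # NA

chr_vals <- as.character(c(Inf, -Inf, NaN))  # "Inf" "-Inf" "NaN"

print(list(int_inf, int_nan, lgl_inf, lgl_nan, chr_vals))
```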
What is the type of the following vectors? Explain why they have that type.
c(1, NA+1L, "C")
c(1L / 0, NA)
c(1:3, 5)
c(3L, NaN+1L)
c(NA, TRUE)
Considering only the four (common) data types, what is R’s implicit type conversion hierarchy (from highest priority to lowest priority)?
Operator | Operation | Vectorized? |
---|---|---|
x \| y | or | Yes |
x & y | and | Yes |
!x | not | Yes |
x \|\| y | or | No |
x && y | and | No |
xor(x, y) | exclusive or | Yes |
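The difference between the vectorized and non-vectorized operators in the table can be sketched as follows. The vectors x and y are illustrative, and the non-vectorized forms are only given length-1 inputs here, since recent R versions (4.3.0 and later) signal an error for longer ones.

```r
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

or_vec  <- x | y       # element-wise: TRUE TRUE TRUE
and_vec <- x & y       # element-wise: TRUE FALSE FALSE
not_vec <- !x          # element-wise: FALSE TRUE FALSE
xor_vec <- xor(x, y)   # element-wise: FALSE TRUE TRUE

# && and || work on single values only
and_one <- x[1] && y[1]   # TRUE
or_one  <- x[2] || y[3]   # FALSE

print(rbind(or_vec, and_vec, not_vec, xor_vec))
```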
Almost all of the basic mathematical operations (and many other functions) in R are vectorized.
If the lengths of the vectors do not match, then the shorter vector has its values recycled to match the length of the longer vector.
The same length coercion rules apply for most basic mathematical operators.
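A minimal sketch of recycling with both logical and mathematical operators. The vector x is illustrative, and suppressWarnings() hides the warning R gives when the longer length is not a multiple of the shorter one.

```r
x <- c(1, 2, 3, 4)

doubled <- x * 2              # 2 is recycled four times: 2 4 6 8
paired  <- x + c(10, 20)      # c(10, 20) is recycled twice: 11 22 13 24
masked  <- c(TRUE, FALSE, TRUE, FALSE) & TRUE   # TRUE is recycled

# Lengths 3 and 2: still recycled, but R warns about the mismatch
uneven <- suppressWarnings(c(1, 2, 3) + c(10, 20))   # 11 22 13

print(list(doubled, paired, masked, uneven))
```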
Operator | Comparison | Vectorized? |
---|---|---|
x < y | less than | Yes |
x > y | greater than | Yes |
x <= y | less than or equal to | Yes |
x >= y | greater than or equal to | Yes |
x != y | not equal to | Yes |
x == y | equal to | Yes |
x %in% y | contains | Yes (over x) |

> & < with characters
While maybe somewhat unexpected, these comparison operators can be used with character values.
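The sketch below exercises a few of the comparison operators on numbers and then on character values. The inputs are illustrative; character comparisons follow the collation order of the current locale, so the examples stick to cases that should not depend on it.

```r
x <- c(1, 5, 10)
y <- c(5, 5, 5)

less    <- x < y             # TRUE FALSE FALSE
equal   <- x == y            # FALSE TRUE FALSE
in_set  <- x %in% c(1, 10)   # TRUE FALSE TRUE, one result per element of x

# Character values are compared as text, not as numbers
fruit   <- "apple" < "banana"       # TRUE
digits  <- "10" < "9"               # TRUE, because "1" sorts before "9"
letters_gt <- c("a", "b", "c") > "b"  # FALSE FALSE TRUE

print(list(less, equal, in_set, fruit, digits, letters_gt))
```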
Conditional execution of code blocks is achieved via if statements.
if is not vectorized
There are a couple of helper functions for collapsing a logical vector down to a single value: any() and all().
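A sketch of why a vector condition is a problem and how any() and all() avoid it. The vector x is illustrative; tryCatch() is used because, depending on the R version, a condition of length greater than one produces either a warning or an error.

```r
x <- c(1, -2, 3)

# Using the whole logical vector as the condition does not work as intended
outcome <- tryCatch(
  if (x > 0) "positive",
  warning = function(w) "warning",
  error   = function(e) "error"
)
print(outcome)

# Collapse the logical vector to a single TRUE/FALSE first
if (any(x > 0)) print("at least one value is positive")
if (!all(x > 0)) print("not every value is positive")
```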
else if and else
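A minimal sketch of chaining conditions with else if and else. The helper sign_label() is hypothetical and exists only for this illustration.

```r
# Hypothetical helper: describe the sign of a single number
sign_label <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"
  }
}

print(sign_label(3))
print(sign_label(-1))
print(sign_label(0))
```

Only the first branch whose condition is TRUE runs; the final else handles everything left over.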
if and return
R’s if conditional statements return a value (invisibly); the two following implementations are equivalent.
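One way to picture two equivalent implementations is sketched here. This is an illustrative pair, not necessarily the code from the original slide; x, s1, and s2 are placeholder names.

```r
x <- 5

# Version 1: assign inside each branch
if (x %% 2 == 0) {
  s1 <- x / 2
} else {
  s1 <- 3 * x + 1
}

# Version 2: assign the value returned by the if expression itself
s2 <- if (x %% 2 == 0) {
  x / 2
} else {
  3 * x + 1
}

print(c(s1, s2))
```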
Take a look at the code below on the left. Without running it in R, what do you expect the outcome will be for each call on the right?
NAs can be particularly problematic for control flow.
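A sketch of the problem: a condition that evaluates to NA is neither TRUE nor FALSE, so if stops with an error. tryCatch() is used here only so the example runs to completion; failed is an illustrative name.

```r
x <- NA

# x > 0 is NA, which if cannot interpret as TRUE or FALSE
failed <- tryCatch({
  if (x > 0) print("positive")
  FALSE
}, error = function(e) TRUE)

print(failed)   # TRUE: the if statement raised an error
```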
To explicitly test if a value is missing it is necessary to use is.na() (often along with any() or all()).
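A short sketch contrasting a comparison against NA with is.na(), and using any() to get a single value that is safe to use in an if statement. The names are illustrative.

```r
x <- c(1, NA, 3)

eq_na      <- x == NA    # NA NA NA: comparing with NA is always missing
is_missing <- is.na(x)   # FALSE TRUE FALSE

# any() collapses the result to one TRUE/FALSE for use in a condition
msg <- if (any(is.na(x))) {
  "x contains at least one missing value"
} else {
  "x has no missing values"
}
print(msg)
```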
Sta 523 - Fall 2023