2.1 Control Structures
Note: Some of the material in this section is adapted from R Programming for Data Science.
The learning objectives of the section are:
- Describe the control flow of an R program
Control structures in R allow you to control the flow of execution of a series of R expressions. Basically, control structures allow you to put some “logic” into your R code, rather than just always executing the same R code every time. Control structures allow you to respond to inputs or to features of the data and execute different R expressions accordingly.
Commonly used control structures are
if
andelse
: testing a condition and acting on itfor
: execute a loop a fixed number of timesbreak
: break the execution of a loopnext
: skip an iteration of a loop
Most control structures are not used in interactive sessions, but rather when writing functions or longer expressions. However, these constructs do not have to be used in functions and it’s a good idea to become familiar with them before we delve into functions.
2.1.1 if
-else
The if
-else
combination is probably the most commonly used control structure in R (or perhaps any language). This structure allows you to test a condition and act on it depending on whether it’s true or false.
For starters, you can just use the if
statement.
if(<condition>) {
## do something
} ## Continue with rest of code
The above code does nothing if the condition is false. If you have an action you want to execute when the condition is false, then you need an else
clause.
if(<condition>) {
## do something
else {
} ## do something else
}
You can have a series of tests by following the initial if
with any number of else if
s.
if(<condition1>) {
## do something
else if(<condition2>) {
} ## do something different
else {
} ## do something different
}
Here is an example of a valid if/else structure.
## Generate a uniform random number
<- runif(1, 0, 10)
x if(x > 3) {
<- 10
y else {
} <- 0
y }
The value of y
is set depending on whether x > 3
or not.
Of course, the else
clause is not necessary. You could have a series of if clauses that always get executed if their respective conditions are true.
if(<condition1>) {
}
if(<condition2>) {
}
2.1.2 for
Loops
For loops are pretty much the only looping construct that you will need in R. While you may occasionally find a need for other types of loops, in most data analysis situations, there are very few cases where a for loop isn’t sufficient.
In R, for loops take an iterator variable and assign it successive values from a sequence or vector. For loops are most commonly used for iterating over the elements of an object (list, vector, etc.)
<- rnorm(10)
numbers for(i in 1:10) {
print(numbers[i])
}1] -0.9567815
[1] 1.347491
[1] -0.03158058
[1] 0.5960358
[1] 1.133312
[1] -0.7085361
[1] 1.525453
[1] 1.114152
[1] -0.1214943
[1] -0.2898258 [
This loop takes the i
variable and in each iteration of the loop gives it values 1, 2, 3, …, 10, executes the code within the curly braces, and then the loop exits.
The following three loops all have the same behavior.
<- c("a", "b", "c", "d")
x
for(i in 1:4) {
## Print out each element of 'x'
print(x[i])
}1] "a"
[1] "b"
[1] "c"
[1] "d" [
The seq_along()
function is commonly used in conjunction with for loops in order to generate an integer sequence based on the length of an object (in this case, the object x
).
## Generate a sequence based on length of 'x'
for(i in seq_along(x)) {
print(x[i])
}1] "a"
[1] "b"
[1] "c"
[1] "d" [
It is not necessary to use an index-type variable.
for(letter in x) {
print(letter)
}1] "a"
[1] "b"
[1] "c"
[1] "d" [
For one line loops, the curly braces are not strictly necessary.
for(i in 1:4) print(x[i])
1] "a"
[1] "b"
[1] "c"
[1] "d" [
However, curly braces are sometimes useful even for one-line loops, because that way if you decide to expand the loop to multiple lines, you won’t be burned because you forgot to add curly braces (and you will be burned by this).
2.1.2.1 Nested for
loops
for
loops can be nested inside of each other.
<- matrix(1:6, 2, 3)
x
for(i in seq_len(nrow(x))) {
for(j in seq_len(ncol(x))) {
print(x[i, j])
} }
Nested loops are commonly needed for multidimensional or hierarchical data structures (e.g. matrices, lists). Be careful with nesting though. Nesting beyond 2 to 3 levels often makes it difficult to read or understand the code. If you find yourself in need of a large number of nested loops, you may want to break up the loops by using functions (discussed later).
2.1.3 next
, break
next
is used to skip an iteration of a loop.
for(i in 1:100) {
if(i <= 20) {
## Skip the first 20 iterations
next
}## Do something here
}
break
is used to exit a loop immediately, regardless of what
iteration the loop may be on.
for(i in 1:100) {
print(i)
if(i > 20) {
## Stop loop after 20 iterations
break
} }
2.1.4 Summary
Control structures like
if
-else
andfor
allow you to control the flow of an R program.Control structures mentioned here are primarily useful for writing programs; for command-line interactive work, the “apply” functions are typically more useful.