if(<condition>) {
## do something
} else {
## do something else
}
if(<condition1>) {
## do something
} else if(<condition2>) {
## do something different
} else {
## do something different
}
This is a valid if/else structure.
if(x > 3) {
y <- 10
} else {
y <- 0
}
So is this one.
y <- if(x > 3) {
10
} else {
0
}
Of course, the else clause is not necessary.
if(<condition1>) {
}
if(<condition2>) {
}
for loops take an iterator variable and assign it successive values from a sequence or vector. They are most commonly used for iterating over the elements of an object (list, vector, etc.).
for(i in 1:10) {
print(i)
}
This loop takes the i variable and in each iteration of the loop gives it values 1, 2, 3, …, 10, and then exits.
These four loops have the same behavior.
x <- c("a", "b", "c", "d")
for(i in 1:4) {
print(x[i])
}
for(i in seq_along(x)) {
print(x[i])
}
for(letter in x) {
print(letter)
}
for(i in 1:length(x)) print(x[i])
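The loops above agree because x is non-empty. As a small sketch of where the index forms differ, assume a zero-length vector (the counter names here are illustrative): 1:length(x) counts down from 1 to 0, while seq_along(x) yields an empty sequence.

```r
x <- character(0)   # an empty vector

# 1:length(x) is 1:0, i.e. c(1, 0), so the loop body runs twice
n_colon <- 0
for (i in 1:length(x)) n_colon <- n_colon + 1

# seq_along(x) is integer(0), so the loop body never runs
n_along <- 0
for (i in seq_along(x)) n_along <- n_along + 1

print(c(colon = n_colon, seq_along = n_along))
```

For this reason seq_along(x) is usually the safer choice when the length of x is not known in advance.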
for loops can be nested.
x <- matrix(1:6, 2, 3)
for(i in 1:nrow(x)) {
for(j in 1:ncol(x)) {
print(x[i, j])
}
}
Be careful with nesting though. Nesting beyond 2–3 levels is often very difficult to read/understand.
While loops begin by testing a condition. If it is true, then they execute the loop body. Once the loop body is executed, the condition is tested again, and so forth.
count <- 0
while(count < 10) {
print(count)
count <- count + 1
}
While loops can potentially result in infinite loops if not written properly. Use with care!
Sometimes there will be more than one condition in the test.
z <- 5
while(z >= 3 && z <= 10) {
print(z)
coin <- rbinom(1, 1, 0.5)
if(coin == 1) { ## random walk
z <- z + 1
} else {
z <- z - 1
}
}
Conditions are always evaluated from left to right.
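A minimal sketch of why left-to-right evaluation matters: with &&, the right-hand condition is only evaluated if the left-hand one is TRUE, so a cheap guard can protect a test that would otherwise fail. The variable names are illustrative.

```r
x <- NULL

# length(x) > 0 is FALSE, so && never evaluates x[1] > 3
# (on its own, if (x[1] > 3) would fail here with "argument is of length zero")
if (length(x) > 0 && x[1] > 3) {
  result <- "big"
} else {
  result <- "empty or small"
}
print(result)
```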
Repeat initiates an infinite loop; these are not commonly used in statistical applications but they do have their uses. The only way to exit a repeat loop is to call break.
x0 <- 1
tol <- 1e-8
repeat {
x1 <- computeEstimate()
if(abs(x1 - x0) < tol) {
break
} else {
x0 <- x1
}
}
The repeat loop above is a bit dangerous because there’s no guarantee it will stop. It is better to set a hard limit on the number of iterations (e.g. using a for loop) and then report whether convergence was achieved or not.
next is used to skip an iteration of a loop.
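One way to bound the iteration, as a sketch only: computeEstimate() is not defined in these notes, so the example assumes a hypothetical compute_estimate() that takes one Newton step towards sqrt(2), and max_iter is an assumed limit.

```r
# Hypothetical stand-in for computeEstimate(): one Newton step towards sqrt(2)
compute_estimate <- function(x) (x + 2 / x) / 2

x0 <- 1
tol <- 1e-8
max_iter <- 100      # hard limit on iterations (assumed value)
converged <- FALSE

for (iter in seq_len(max_iter)) {
  x1 <- compute_estimate(x0)
  if (abs(x1 - x0) < tol) {
    converged <- TRUE
    break
  }
  x0 <- x1
}

# Report whether the tolerance was reached before the limit
if (converged) {
  message("Converged after ", iter, " iterations: ", x1)
} else {
  message("No convergence within ", max_iter, " iterations")
}
```

The for loop guarantees termination, and the converged flag lets the caller distinguish a real answer from one that merely ran out of iterations.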
for(i in 1:100) {
if(i <= 20) {
## Skip the first 20 iterations
next
}
## Do something here
}
return signals that a function should exit and return a given value.
Summary
Control structures like if, while, and for allow you to control the flow of an R program.
Infinite loops should generally be avoided, even if they are theoretically correct.
Control structures mentioned here are primarily useful for writing programs; for command-line interactive work, the *apply functions are more useful.
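As an illustration of that last point, here is a sketch comparing an explicit loop with sapply() on a small list. The list and variable names are made up for the example.

```r
lst <- list(a = 1:3, b = 4:10)

# Loop version: allocate the result, then fill it element by element
means_loop <- numeric(length(lst))
for (i in seq_along(lst)) {
  means_loop[i] <- mean(lst[[i]])
}

# sapply version: one call, and the result keeps the list names
means_apply <- sapply(lst, mean)

print(means_apply)
```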
R has many ways of avoiding iteration, by acting on whole objects
How many languages add 2 vectors:
c <- numeric(length(a))
for (i in seq_along(a)) { c[i] <- a[i] + b[i] }
How R adds 2 vectors:
a+b
or a triple for() loop for matrix multiplication vs. a %*% b
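To make the matrix comparison concrete, a sketch with two small example matrices (values chosen arbitrarily): the triple loop accumulates each entry of the product one term at a time, and %*% gives the same result in a single expression.

```r
a <- matrix(1:6, nrow = 2, ncol = 3)
b <- matrix(1:12, nrow = 3, ncol = 4)

# Triple loop: entry (i, j) is the sum over k of a[i, k] * b[k, j]
prod_loop <- matrix(0, nrow = nrow(a), ncol = ncol(b))
for (i in seq_len(nrow(a))) {
  for (j in seq_len(ncol(b))) {
    for (k in seq_len(ncol(a))) {
      prod_loop[i, j] <- prod_loop[i, j] + a[i, k] * b[k, j]
    }
  }
}

# Whole-object version
prod_vec <- a %*% b

all.equal(prod_loop, prod_vec)
```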
Many functions are set up to vectorize automatically
abs(-3:3)
[1] 3 2 1 0 1 2 3
log(1:7)
[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 1.7917595 1.9459101
See also apply() from last week
We'll come back to this in great detail later
ifelse(x^2 > 1, 2*abs(x)-1, x^2)
1st argument is a Boolean vector, then pick from the 2nd or 3rd vector arguments as TRUE or FALSE
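A short sketch of what ifelse() does with the expression above, assuming an example vector x: each element is tested separately, and the result takes the matching element from the second or third argument. The loop is only there to show the equivalent element-by-element logic.

```r
x <- c(-2, -0.5, 0, 0.5, 3)

# Vectorized: one call handles every element
y_vec <- ifelse(x^2 > 1, 2 * abs(x) - 1, x^2)

# Equivalent explicit loop with if/else
y_loop <- numeric(length(x))
for (i in seq_along(x)) {
  if (x[i]^2 > 1) {
    y_loop[i] <- 2 * abs(x[i]) - 1
  } else {
    y_loop[i] <- x[i]^2
  }
}

print(y_vec)
```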
if, nested if, switch
for, while
Data structures tie related values into one object
Functions tie related commands into one object
In both cases: easier to understand, easier to work with, easier to build into larger things
circle.area <- function(r) { return(pi*r^2) }
circle.area(2)
[1] 12.56637
Our functions get used just like the built-in ones:
x=seq(1, 10, 1)
circle.area(x)
[1] 3.141593 12.566371 28.274334 50.265482 78.539816 113.097336
[7] 153.938040 201.061930 254.469005 314.159265
shapes <- data.frame(type = c('circle', 'square', 'circle', 'square'),
                     dimension = c(1, 2, 3, 4))
| type | dimension | area |
|---|---|---|
| circle | 1 | 3.141593 |
| square | 2 | 4.000000 |
| circle | 3 | 28.274334 |
| square | 4 | 16.000000 |
areas <- vector of 0s to be populated with results
for each shape in shapes:
if type is circle
then shape.area is (shape's dimension)^2*pi
store shape.area in areas
else if type is square
then shape.area is (shape's dimension)^2
store shape.area in areas
area<- function(my.dataframe) {
#start with getting number of entries in the table
rownum= dim(my.dataframe)[1]
#create vector that will keep the results
results<-rep(0, rownum)
#iterate through the entries calculate the area based on the type
for (i in 1:rownum){
dimension=my.dataframe[i,2]
if (my.dataframe[i,1]=='circle')
{
results[i]<-pi*dimension^2
}
else if (my.dataframe[i,1]=='square')
{
results[i]<- dimension^2
}
}
return(results)
}
m.results=area(shapes)
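The same calculation can also be written without an explicit loop. This is a sketch, not a replacement for area(): area_vectorized is a hypothetical name, and the data frame is rebuilt here so the block runs on its own.

```r
shapes <- data.frame(type = c("circle", "square", "circle", "square"),
                     dimension = c(1, 2, 3, 4))

# Hypothetical vectorized counterpart of area().
# Note: any type other than "circle" is treated as a square here,
# whereas the loop version leaves unknown types at 0.
area_vectorized <- function(my.dataframe) {
  ifelse(my.dataframe$type == "circle",
         pi * my.dataframe$dimension^2,
         my.dataframe$dimension^2)
}

# Store the result as a new column, giving the three-column table
shapes$area <- area_vectorized(shapes)
print(shapes)
```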
Interfaces: the inputs or arguments; the outputs or return value
Calls other functions: ifelse(), abs(), and the operators ^ and >
could also call other functions we've written
return() says what the output is
alternately, return the last evaluation; I like explicit returns better
Comments: Not required by R, but a Very Good Idea
One-line description of purpose; listing of arguments; listing of outputs
will say more about design later
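A sketch of what these points can look like together, using the ifelse() expression from earlier. The function name psi and the comment layout are illustrative assumptions, not a required convention.

```r
# Piecewise transform: x^2 where x^2 <= 1, and 2|x| - 1 elsewhere
# Inputs: x, a numeric vector
# Output: a numeric vector of the same length as x
psi <- function(x) {
  out <- ifelse(x^2 > 1, 2 * abs(x) - 1, x^2)
  return(out)   # explicit return of the output
}

psi(c(-2, 0.5))
```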
x <- 7
y <- c("A","C","G","T","U")
adder <- function(y) { x<- x+y; return(x) }
adder(1)
[1] 8
x
[1] 7
y
[1] "A" "C" "G" "T" "U"
circle.area <- function(r) { return(pi*r^2) }
circle.area(c(1,2,3))
[1] 3.141593 12.566371 28.274334
truepi <- pi
pi <- 3 # Valid in 1800s Indiana, or drowned R'lyeh
circle.area(c(1,2,3))
[1] 3 12 27
pi <- truepi # Restore sanity
circle.area(c(1,2,3))
[1] 3.141593 12.566371 28.274334