if(<condition>) {
## do something
} else {
## do something else
}
if(<condition1>) {
## do something
} else if(<condition2>) {
## do something different
} else {
## do something different
}
This is a valid if/else structure.
if(x > 3) {
y <- 10
} else {
y <- 0
}
So is this one.
y <- if(x > 3) {
10
} else {
0
}
Of course, the else clause is not necessary.
if(<condition1>) {
}
if(<condition2>) {
}
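As a concrete illustration of the templates above, the following sketch classifies a single number. The variable names `temp`, `label`, and `label2` are made up for this example.

```r
## Hypothetical example: classify one temperature reading
temp <- 18

if(temp < 0) {
    label <- "freezing"
} else if(temp < 25) {
    label <- "mild"
} else {
    label <- "hot"
}
print(label)

## The same choice written as an expression whose value is assigned,
## mirroring the y <- if(x > 3) ... form shown earlier
label2 <- if(temp < 0) "freezing" else if(temp < 25) "mild" else "hot"
print(label2)
```

Both forms pick the same branch; the second simply uses the value of the `if` expression directly.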
for loops take an iterator variable and assign it successive values from a sequence or vector. They are most commonly used for iterating over the elements of an object (list, vector, etc.).
for(i in 1:10) {
print(i)
}
This loop takes the i variable and in each iteration of the loop gives it the values 1, 2, 3, …, 10, and then exits.
These three loops have the same behavior.
x <- c("a", "b", "c", "d")
for(i in 1:4) {
print(x[i])
}
for(i in seq_along(x)) {
print(x[i])
}
for(letter in x) {
print(letter)
}
For one-line loops, the curly braces are not strictly necessary.
for(i in 1:length(x)) print(x[i])
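One practical difference between these forms shows up when the vector is empty: `1:length(x)` then counts down from 1 to 0, while `seq_along(x)` yields an empty sequence. A small sketch, using the made-up names `empty`, `n_colon`, and `n_seq`:

```r
empty <- character(0)

## 1:length(empty) is 1:0, i.e. the vector c(1, 0), so the body runs twice
n_colon <- 0
for(i in 1:length(empty)) {
    n_colon <- n_colon + 1
}

## seq_along(empty) has length zero, so the body never runs
n_seq <- 0
for(i in seq_along(empty)) {
    n_seq <- n_seq + 1
}

print(c(n_colon, n_seq))
```

For that reason `seq_along(x)` is usually the safer choice when the length of `x` is not known in advance.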
for loops can be nested.
x <- matrix(1:6, 2, 3)
for(i in 1:nrow(x)) {
for(j in 1:ncol(x)) {
print(x[i, j])
}
}
Be careful with nesting though. Nesting beyond 2–3 levels is often very difficult to read/understand.
While loops begin by testing a condition. If it is true, then they execute the loop body. Once the loop body is executed, the condition is tested again, and so forth.
count <- 0
while(count < 10) {
print(count)
count <- count + 1
}
While loops can potentially result in infinite loops if not written properly. Use with care!
Sometimes there will be more than one condition in the test.
z <- 5
while(z >= 3 && z <= 10) {
print(z)
coin <- rbinom(1, 1, 0.5)
if(coin == 1) { ## random walk
z <- z + 1
} else {
z <- z - 1
}
}
Conditions are always evaluated from left to right.
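Left-to-right evaluation matters in practice because `&&` stops as soon as the result is known. The sketch below illustrates this; the variable names `result` and `in_range` are chosen only for the example.

```r
## The left operand is FALSE, so the right operand is never evaluated
## and the stop() call is never reached
result <- FALSE && stop("the right-hand side was evaluated")
print(result)

## Practical use: guard a test that would fail on missing input.
## Here z is NULL, so the comparisons to the right are skipped.
z <- NULL
in_range <- !is.null(z) && z >= 3 && z <= 10
print(in_range)
```

Putting the cheap or protective check first keeps the later comparisons from running on input they cannot handle.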
Repeat initiates an infinite loop; these are not commonly used in statistical applications but they do have their uses. The only way to exit a repeat loop is to call break.
x0 <- 1
tol <- 1e-8
repeat {
x1 <- computeEstimate()
if(abs(x1 - x0) < tol) {
break
} else {
x0 <- x1
}
}
The loop in the previous slide is a bit dangerous because there’s no guarantee it will stop. Better to set a hard limit on the number of iterations (e.g. using a for loop) and then report whether convergence was achieved or not.
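A minimal sketch of that safer pattern follows. Since `computeEstimate()` is not defined here, the sketch substitutes a stand-in update rule (one Newton step towards the square root of 2); `update_estimate`, `max_iter`, and `converged` are names assumed for illustration.

```r
## Stand-in for computeEstimate(): one Newton step towards sqrt(2).
## This update rule is an assumption made only so the sketch can run.
update_estimate <- function(x) (x + 2 / x) / 2

x0 <- 1
tol <- 1e-8
max_iter <- 100       # hard limit on the number of iterations
converged <- FALSE

for(iter in seq_len(max_iter)) {
    x1 <- update_estimate(x0)
    if(abs(x1 - x0) < tol) {
        converged <- TRUE
        break
    }
    x0 <- x1
}

## Report whether convergence was achieved
if(converged) {
    message("Converged after ", iter, " iterations: ", x1)
} else {
    warning("No convergence within ", max_iter, " iterations")
}
```

The loop is guaranteed to stop after `max_iter` passes, and the `converged` flag tells the caller which of the two exits was taken.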
next is used to skip an iteration of a loop.
for(i in 1:100) {
if(i <= 20) {
## Skip the first 20 iterations
next
}
## Do something here
}
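A small runnable variant of the same idea, with the made-up variable `total`, adds up only the odd numbers from 1 to 10:

```r
total <- 0
for(i in 1:10) {
    if(i %% 2 == 0) {
        ## Skip even values of i
        next
    }
    total <- total + i
}
print(total)   # 1 + 3 + 5 + 7 + 9 = 25
```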
return signals that a function should exit and return a given value.
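Because it exits the whole function, `return` also ends any loop the function is in the middle of. The helper `first_negative` below is hypothetical and serves only to show this.

```r
## Hypothetical helper: position of the first negative value, or NA if none
first_negative <- function(v) {
    for(i in seq_along(v)) {
        if(v[i] < 0) {
            return(i)    # exits the function immediately, ending the loop too
        }
    }
    NA_integer_          # reached only if no element was negative
}

first_negative(c(4, 2, -1, 7, -3))   # third element is the first negative one
first_negative(c(1, 2, 3))           # no negative element, so NA
```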
Summary
Control structures like if, while, and for allow you to control the flow of an R program.
Infinite loops should generally be avoided, even if they are theoretically correct.
Control structures mentioned here are primarily useful for writing programs; for command-line interactive work, the *apply functions are more useful.
R has many ways of avoiding iteration, by acting on whole objects
How many languages add 2 vectors:
c <- numeric(length(a))
for (i in seq_along(a)) { c[i] <- a[i] + b[i] }
How R adds 2 vectors:
a+b
or a triple for() loop for matrix multiplication vs. a %*% b
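To make the matrix comparison concrete, here is a sketch of the triple loop next to the built-in operator. The matrices and the names `looped` and `vectorized` are chosen only for illustration.

```r
a <- matrix(1:6, nrow = 2, ncol = 3)
b <- matrix(1:12, nrow = 3, ncol = 4)

## Explicit triple loop: entry (i, j) is the sum over k of a[i, k] * b[k, j]
looped <- matrix(0, nrow = nrow(a), ncol = ncol(b))
for(i in seq_len(nrow(a))) {
    for(j in seq_len(ncol(b))) {
        for(k in seq_len(ncol(a))) {
            looped[i, j] <- looped[i, j] + a[i, k] * b[k, j]
        }
    }
}

## Vectorized equivalent
vectorized <- a %*% b

all.equal(looped, vectorized)
```

The two results agree; the one-line version is shorter and leaves the looping to R's internal routines.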
Many functions are set up to vectorize automatically
abs(-3:3)
[1] 3 2 1 0 1 2 3
log(1:7)
[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 1.7917595 1.9459101
See also apply() from last week
We'll come back to this in great detail later
ifelse(x^2 > 1, 2*abs(x)-1, x^2)
The 1st argument is a Boolean vector; each element of the result is picked from the 2nd or 3rd vector argument according to whether the corresponding element is TRUE or FALSE.
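A short worked example may help; the input vector and the names `vec` and `out` are assumptions for illustration. The explicit loop is included only to show what `ifelse()` does element by element.

```r
x <- c(-2, 0.5, 3)

## Where x^2 > 1 take 2*abs(x)-1, otherwise take x^2
vec <- ifelse(x^2 > 1, 2*abs(x) - 1, x^2)
vec
## [1] 3.00 0.25 5.00

## The same computation written as an explicit loop
out <- numeric(length(x))
for(i in seq_along(x)) {
    out[i] <- if(x[i]^2 > 1) 2*abs(x[i]) - 1 else x[i]^2
}
all.equal(vec, out)
```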
if, nested if, switch
for, while
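Of these, `switch` is the only one without an example so far. A minimal sketch, using a hypothetical helper `summarise.by` that picks a summary statistic by name:

```r
## Hypothetical helper: choose a computation based on a character string
summarise.by <- function(v, method) {
    switch(method,
           mean   = mean(v),
           median = median(v),
           range  = max(v) - min(v),
           ## an unnamed final argument acts as the default branch
           stop("unknown method: ", method))
}

summarise.by(c(1, 2, 10), "median")   # middle value, 2
summarise.by(c(1, 2, 10), "range")    # 10 - 1 = 9
```

A `switch` like this often reads more clearly than a long chain of `else if` tests on the same variable.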
Data structures tie related values into one object
Functions tie related commands into one object
In both cases: easier to understand, easier to work with, easier to build into larger things
circle.area <- function(r) { return(pi*r^2) }
circle.area(2)
[1] 12.56637
Our functions get used just like the built-in ones:
x=seq(1, 10, 1)
circle.area(x)
[1] 3.141593 12.566371 28.274334 50.265482 78.539816 113.097336
[7] 153.938040 201.061930 254.469005 314.159265
shapes <- data.frame(type=c('circle', 'square', 'circle', 'square'), dimension=c(1, 2, 3, 4))
type | dimension | area |
---|---|---|
circle | 1 | 3.141593 |
square | 2 | 4.000000 |
circle | 3 | 28.274334 |
square | 4 | 16.000000 |
areas <- vector of 0s to be populated with results
for each shape in shapes:
    if type is circle
        then shape.area is (shape's dimension)^2*pi
        add shape.area to areas
    else if type is square
        then shape.area is (shape's dimension)^2
        add shape.area to areas
area <- function(my.dataframe) {
    # start by getting the number of entries in the table
    rownum <- nrow(my.dataframe)
    # create the vector that will hold the results
    results <- rep(0, rownum)
    # iterate through the entries and calculate the area based on the type
    for (i in seq_len(rownum)) {
        dimension <- my.dataframe[i, 2]
        if (my.dataframe[i, 1] == 'circle') {
            results[i] <- pi * dimension^2
        } else if (my.dataframe[i, 1] == 'square') {
            results[i] <- dimension^2
        }
    }
    return(results)
}
m.results <- area(shapes)
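Since the same formula applies to every row of a given type, the loop can also be replaced by a vectorized computation. The following is a sketch of one possible alternative, not a replacement prescribed by these notes; `area.vectorized` is a hypothetical name, and the table is rebuilt so the block runs on its own.

```r
## Rebuild the example table so the sketch is self-contained
shapes <- data.frame(type = c("circle", "square", "circle", "square"),
                     dimension = c(1, 2, 3, 4))

## Hypothetical vectorized counterpart of area(): same formulas, no loop.
## Types other than circle or square get NA here (the loop version leaves 0).
area.vectorized <- function(my.dataframe) {
    type <- my.dataframe$type
    dimension <- my.dataframe$dimension
    ifelse(type == "circle", pi * dimension^2,
           ifelse(type == "square", dimension^2, NA))
}

shapes$area <- area.vectorized(shapes)
shapes
```

This version refers to columns by name rather than by position, so it does not depend on the column order of the data frame.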
Interfaces: the inputs or arguments; the outputs or return value
Calls other functions ifelse(), abs(), operators ^ and >
could also call other functions we've written
return() says what the output is
alternately, return the last evaluation; I like explicit returns better
Comments: Not required by R, but a Very Good Idea
One-line description of purpose; listing of arguments; listing of outputs
will say more about design later
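Putting these points together, a commented function might look like the sketch below. It wraps the `ifelse()` expression shown earlier; the function name `psi.1` and the wording of the comments are assumptions made for illustration.

```r
# Piecewise function: quadratic for small inputs, linear for large ones
# Inputs: vector of numbers (x)
# Outputs: vector with x^2 where x^2 <= 1, and 2|x|-1 elsewhere
psi.1 <- function(x) {
    psi <- ifelse(x^2 > 1, 2*abs(x) - 1, x^2)
    return(psi)   # explicit return of the output
}

psi.1(c(-2, 0.5, 3))
```

The three comment lines cover purpose, arguments, and output, which is usually enough for a reader to use the function without reading its body.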
x <- 7
y <- c("A","C","G","T","U")
adder <- function(y) { x<- x+y; return(x) }
adder(1)
[1] 8
x
[1] 7
y
[1] "A" "C" "G" "T" "U"
circle.area <- function(r) { return(pi*r^2) }
circle.area(c(1,2,3))
[1] 3.141593 12.566371 28.274334
truepi <- pi
pi <- 3 # Valid in 1800s Indiana, or drowned R'lyeh
circle.area(c(1,2,3))
[1] 3 12 27
pi <- truepi # Restore sanity
circle.area(c(1,2,3))
[1] 3.141593 12.566371 28.274334
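The runs above show that a function looks up names it does not define itself (here pi) outside its own body at the time it is called, so redefining pi globally changes the result. One way to reduce that dependence, sketched here under the assumption that an extra argument is acceptable, is to pass the constant in with a default value. The names `circle.area.safe` and `pi.value` are hypothetical.

```r
## The default is taken from base R directly, so a redefined global pi is ignored
circle.area.safe <- function(r, pi.value = base::pi) {
    return(pi.value * r^2)
}

pi <- 3                                  # shadow pi in the workspace
safe.result <- circle.area.safe(c(1, 2, 3))
rm(pi)                                   # remove the shadow; base pi is visible again
safe.result                              # computed with the true value of pi
```

A caller who really does want a different constant can still supply it explicitly, for example `circle.area.safe(2, pi.value = 3)`.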