Building Logic:
Operators, Conditions, and Functions

IOC-R Week 4

Recap Week 3

Brief Recap Week 3

  • Data frames: row-column structure
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
df
  x y
1 1 a
2 2 b
3 3 c
  • Lists: store anything
my_list <- list(x = 1:3, y = letters[5:1])
my_list
$x
[1] 1 2 3

$y
[1] "e" "d" "c" "b" "a"

What are the outputs of the following code?

df$x
df[[2]]
df[3, ]
df[, "y", drop = FALSE]
df[1, 2]


my_list$x
my_list[1]
my_list["y"]
my_list[[2]][3]

Operators

Assignment and Arithmetic Operators

  • Assignment operators: <-, ->, =
x <- 1
x
[1] 1
2 -> x # not recommended
x
[1] 2
y = "abc" # not recommended
y
[1] "abc"
mean(x = 1:3) # assign a value to a function's parameter
[1] 2
  • Arithmetic operators: +, -, *, /, ^ (exponentiation), %% (remainder), etc.
10^3
[1] 1000
10 %% 3 # modulus (remainder) of a division operation
[1] 1
x <- c(2, 8, 3)
x - 1
[1] 1 7 2
x - 1:3
[1] 1 6 0
y <- c(6, 4, 1)
x + y
[1]  8 12  4
x * y
[1] 12 32  3
x / y
[1] 0.3333333 2.0000000 3.0000000
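
The remainder operator pairs naturally with integer division (%/%), one of the operators covered by the "etc." above. A minimal sketch, using a hypothetical vector n that is not part of the slides, of how the two relate element-wise:

```r
# Hypothetical values, chosen only for illustration
n <- c(7, 10, 12)

q <- n %/% 5   # integer division: how many times 5 fits into each element
r <- n %% 5    # remainder left over after that division

q              # 1 2 2
r              # 2 0 2
q * 5 + r      # rebuilds the original values: 7 10 12
```

Both operators work element by element, like +, -, * and /.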

Comparison Operators

Comparison operators (==, !=, >, <, >=, <=) work element-wise and return logical values.

10 == 3
[1] FALSE
10 != 3
[1] TRUE
10 > 3
[1] TRUE
10 >= 3
[1] TRUE
10 < 3
[1] FALSE
10 <= 3
[1] FALSE
x <- c(2, 8, 3)
y <- c(6, 8, 1)
x == y
[1] FALSE  TRUE FALSE
x != y
[1]  TRUE FALSE  TRUE
x > y
[1] FALSE FALSE  TRUE
x <= y
[1]  TRUE  TRUE FALSE

What are the expected results?

  • x == 8
  • y > 1
  • a is a vector of numbers 1 to 5, a[a > 3] returns?
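
Because a comparison returns one logical value per element, its result can be used directly as an index. A small sketch with a hypothetical vector expr (the name and values are assumptions, not from the slides):

```r
# Hypothetical expression values, for illustration only
expr <- c(5.2, 0.4, 3.1, 8.7)

is_high <- expr > 3          # one TRUE/FALSE per element
is_high                      # TRUE FALSE TRUE TRUE

high_expr <- expr[is_high]   # keep only the elements where is_high is TRUE
high_expr                    # 5.2 3.1 8.7
```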

Logical Operators

Logical operators NOT (!), AND (&), OR (|), etc. work element-wise and return logical values.

!TRUE
[1] FALSE
!FALSE
[1] TRUE

To combine two conditions A and B:

A      B      A & B  A | B
TRUE   TRUE   TRUE   TRUE
TRUE   FALSE  FALSE  TRUE
FALSE  TRUE   FALSE  TRUE
FALSE  FALSE  FALSE  FALSE

What are the expected results?

  • (3 > 1) & (7 < 5)
  • (3 > 1) | (7 < 5)
(3 > 1) & (7 < 5)
[1] FALSE
(3 > 1) | (7 < 5)
[1] TRUE
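
The same operators combine conditions on whole vectors, element by element. A sketch under assumed data: lfc and padj are hypothetical vectors of fold changes and adjusted p-values, and the cut-offs are arbitrary.

```r
# Hypothetical values, for illustration only
lfc  <- c(1.2, -0.5, 0.9, 0.7, -1.1)
padj <- c(0.01, 0.20, 0.03, 0.60, 0.04)

# AND: both conditions must hold for the same element
sig_up <- (lfc > 0) & (padj < 0.05)
sig_up    # TRUE FALSE TRUE FALSE FALSE

# OR: at least one condition must hold
changed <- (lfc > 1) | (lfc < -1)
changed   # TRUE FALSE FALSE FALSE TRUE

lfc[sig_up]   # 1.2 0.9
```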

The %in% Operator

We use %in% to check whether each value on the left-hand side is present in the right-hand side. It returns one logical value per left-hand element.

## check membership
1 %in% 1:3
[1] TRUE
1:3 %in% 1
[1]  TRUE FALSE FALSE
"a" %in% c("abc", "a")
[1] TRUE

Data frame df has two columns, x and y. What is the result of the following code?

"col3" %in% colnames(df)

"col3" %in% colnames(df)
[1] FALSE
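
A common use of %in% is to keep only the elements that belong to a list of interest. A minimal sketch; the gene names below are hypothetical example data, not taken from the slides:

```r
# Hypothetical gene lists, for illustration only
genes <- c("TP53", "BRCA1", "MYC", "EGFR")
genes_of_interest <- c("MYC", "TP53", "KRAS")

in_list <- genes %in% genes_of_interest
in_list                       # TRUE FALSE TRUE FALSE

found <- genes[in_list]       # genes present in the list of interest
found                         # "TP53" "MYC"

# Negate with ! to find entries of interest that are absent from genes
not_found <- genes_of_interest[!(genes_of_interest %in% genes)]
not_found                     # "KRAS"
```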

Bonus: Operator Precedence

Simplified precedence order (highest to lowest):

  1. Exponentiation (^)
  2. Special operators (%in%, %%)
  3. Multiplication and division (*, /), then addition and subtraction (+, -)
  4. Relational (>, <, >=, <=, ==, !=)
  5. Logical (! > & > |)

The order can be changed using ():

1 + 2 * 3
[1] 7
(1 + 2) * 3
[1] 9
TRUE | TRUE & FALSE
[1] TRUE
(TRUE | TRUE) & FALSE
[1] FALSE
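
The position of ! is the one that most often surprises: it binds more tightly than & and |, but more loosely than comparisons. A short sketch (the variable names p1 to p4 are placeholders) comparing each expression with its explicitly parenthesised form:

```r
# ! is applied before &
p1 <- !TRUE & FALSE      # read as (!TRUE) & FALSE
p2 <- !(TRUE & FALSE)    # parentheses force & first
p1                       # FALSE
p2                       # TRUE

# comparisons are applied before !
p3 <- !1 == 2            # read as !(1 == 2)
p4 <- (!1) == 2          # parentheses force ! first: FALSE == 2
p3                       # TRUE
p4                       # FALSE
```

When in doubt, adding parentheses costs nothing and makes the intent explicit.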

any(), all() and which()

Given a logical vector:

-3:3
[1] -3 -2 -1  0  1  2  3
-3:3 > 0
[1] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE
  • any(): is at least one of the values TRUE?
any(-3:3 > 0)
[1] TRUE
  • all(): are all of the values TRUE?
all(-3:3 > 0)
[1] FALSE
  • which(): return indices of TRUE values.
which(-3:3 > 0)
[1] 5 6 7
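
The indices returned by which() can be reused to extract the matching elements, and sum() counts the TRUE values because TRUE is treated as 1. A sketch with a hypothetical named vector (names and values are assumptions for illustration):

```r
# Hypothetical named fold changes, for illustration only
lfc <- c(geneA = 1.5, geneB = -0.2, geneC = 2.1)

idx <- which(lfc > 1)   # positions of TRUE values; names are kept
idx                     # geneA = 1, geneC = 3
names(idx)              # "geneA" "geneC"

up <- lfc[idx]          # extract the matching elements
up                      # geneA = 1.5, geneC = 2.1

n_up <- sum(lfc > 1)    # number of TRUE values
n_up                    # 2
```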

We have a vector log2FoldChange <- c(1.2, -0.5, 0.9, 0.7, -1.1). What are the expected results of the following code?

  • which(log2FoldChange > 0.8)
  • any(log2FoldChange > 0.8)
  • all(log2FoldChange > 0)

Conditions

Conditional Statements

Conditional statements allow us to make decisions based on logical conditions, guiding how the code behaves in different scenarios.

flowchart LR
  A{temperature > 37} --> B(TRUE)
  A --> C(FALSE)
  B --> D[Fever]
  C --> E[Normal]

flowchart LR
  A{log2FC > 0} --> B(TRUE)
  A --> C(FALSE)
  B --> D[Up-regulated]
  C --> E[Not up-regulated]

if and if else

Syntax:

## if statement
if (condition) {
  # code to run if `condition` is TRUE
}

## if else statement
if (condition) {
  # code to run if `condition` is TRUE
} else {
  # code to run if `condition` is FALSE
}
  • You can use if without else, but never else without if.
  • The condition must evaluate to exactly one logical value (TRUE or FALSE) and cannot be NA.

flowchart LR
  A{log2FC > 0} --> B(TRUE)
  A --> C(FALSE)
  B --> D[Up-regulated]
  C --> E[Not up-regulated]

log2FC <- 2.5
if (log2FC > 0) {
  # code to run if condition is TRUE
  print("Up-regulated")
} else {
  # code to run if condition is FALSE
  print("Not up-regulated")
}
[1] "Up-regulated"

What will you get when log2FC is 0?

[1] "Not up-regulated"
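
Since the condition cannot be NA, a missing value has to be handled before the comparison is made. One possible way, sketched below, is to test is.na() first and chain the remaining cases with else if (standard R syntax that is not shown on the slides); the "Missing" label is an arbitrary choice.

```r
log2FC <- NA   # hypothetical missing measurement

if (is.na(log2FC)) {
  # checked first, so the comparisons below never see NA
  status <- "Missing"
} else if (log2FC > 0) {
  status <- "Up-regulated"
} else {
  status <- "Not up-regulated"
}
status   # "Missing"
```

Without the first branch, if (log2FC > 0) would stop with an error because NA > 0 evaluates to NA.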

The ifelse() Function

Syntax:

ifelse(test, yes, no)

Example:

log2FC <- -1
ifelse(test = log2FC > 0, yes = "Up-regulated", no = "Not up-regulated")
[1] "Not up-regulated"
# nested condition
ifelse(
  test = log2FC > 0,
  yes = "Up-regulated",
  no = ifelse(log2FC < 0, yes = "Down-regulated", no = "No change")
)
[1] "Down-regulated"

The “test” parameter can be a logical vector.

vec_lfc <- c(-1, 0, 1)
ifelse(vec_lfc > 0, yes = "Up-regulated", no = "Not up-regulated")
[1] "Not up-regulated" "Not up-regulated" "Up-regulated"    
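
Because ifelse() is vectorised, it can label every row of a data frame in one call. A minimal sketch, assuming a hypothetical data frame res with columns gene and log2FC:

```r
# Hypothetical results table, for illustration only
res <- data.frame(
  gene   = c("geneA", "geneB", "geneC"),
  log2FC = c(-1, 0, 1)
)

# Nested ifelse(): one label per row
res$status <- ifelse(
  test = res$log2FC > 0,
  yes  = "Up-regulated",
  no   = ifelse(res$log2FC < 0, yes = "Down-regulated", no = "No change")
)
res$status   # "Down-regulated" "No change" "Up-regulated"
```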

Functions

What is a Function?

Functions = Reusable blocks of code.

Call a function: write the function name followed by (), with any required arguments inside the parentheses.

  • Built-in functions
mean(x = 1:3)
[1] 2
  • Custom functions
add_one <- function(x) {x + 1}
add_one(x = 1:3)
[1] 2 3 4
  • Functions from additional packages
library(readr)
read_csv(file = "path/to/data.csv")

We’ll talk about packages next week!

Use ? or help() to view the function’s documentation, e.g.: ?mean, help(mean)

Anatomy of a Function

Syntax:

function_name <- function(arguments) {
  # function body (code for tasks)
  return(result)
}
  • The function name: concise and preferably a verb.
  • The keyword function, followed by the argument(s) in parentheses.
  • The curly brackets {}, which frame the function body.
  • The return() call, usually at the end, which returns a result.

What are the components of the following function?

add_one <- function(x) {
  result <- x + 1
  return(result)
}
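
The return() call is "usually" present because R also returns the value of the last evaluated expression when no return() is given. A short sketch comparing the two styles; the function names add_one_explicit and add_one_implicit are made up for this illustration:

```r
# Explicit: the result is handed back with return()
add_one_explicit <- function(x) {
  result <- x + 1
  return(result)
}

# Implicit: the last evaluated expression is the return value
add_one_implicit <- function(x) {
  x + 1
}

out_explicit <- add_one_explicit(1:3)
out_implicit <- add_one_implicit(1:3)
out_explicit   # 2 3 4
out_implicit   # 2 3 4
```

Both forms give the same result; an explicit return() can make longer functions easier to read.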

Function Parameters

A function can have 0, 1 or multiple parameters, with or without default values.

  • No parameter: perform an action without needing input
say_hello <- function() {
  print("Hello!")
}
say_hello()
[1] "Hello!"
  • One parameter:
add_one <- function(x = 5) {
  return(x + 1)
}
add_one() # use default value of x
[1] 6
add_one(x = 1:3) # use custom values for x
[1] 2 3 4
vec <- 1:2
add_one(x = vec) # use custom values from a variable
[1] 2 3
  • Multiple parameters
divise <- function(numerator, denominator) {
  return(numerator/denominator)
}
# Provide argument values in the expected order
divise(numerator = 1, denominator = 2)
[1] 0.5
divise(1, 2)
[1] 0.5
# Provide argument values in a different order
divise(2, 1)
[1] 2
divise(denominator = 2, numerator = 1)
[1] 0.5
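
Required and default parameters can be mixed in the same function. A sketch with a hypothetical helper is_changed() (the name, the threshold parameter and its default of 1 are assumptions for illustration):

```r
# Hypothetical helper: flags values whose magnitude exceeds a threshold
is_changed <- function(lfc, threshold = 1) {
  return((lfc > threshold) | (lfc < -threshold))
}

r1 <- is_changed(c(1.2, -0.5, -1.1))                    # default threshold
r2 <- is_changed(c(1.2, -0.5, -1.1), threshold = 0.4)   # override the default
r3 <- is_changed(threshold = 2, lfc = 1.2)              # named, in any order

r1   # TRUE FALSE TRUE
r2   # TRUE TRUE TRUE
r3   # FALSE
```

Parameters without a default (here lfc) must always be supplied; those with a default can be omitted.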

Function Body

Example: a function to calculate the geometric mean.

geo_mean <- function(x) {
  # calculate geometric mean
  result <- exp(mean(log(x)))
  return(result)
}
geo_mean(c(1:10, 50))
[1] 5.633703
mean(c(1:10, 50))
[1] 9.545455

However, the geometric mean can only be calculated for positive numbers.

Use a condition to ensure this.

geo_mean <- function(x) {
  # keep only positive values
  if (any(x <= 0)) {
    x <- x[x > 0]
  }
  # calculate geometric mean
  result <- exp(mean(log(x)))
  return(result)
}
geo_mean(c(1:10, 50, -1, 0))
[1] 5.633703
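
Silently dropping values can hide problems in the input. One possible variant, sketched here under the assumption that x contains no NA, reports how many values were removed using message(); the name geo_mean_verbose is made up for this illustration.

```r
# Variant of geo_mean() that reports removed values (assumes x has no NA)
geo_mean_verbose <- function(x) {
  keep <- x > 0                      # TRUE for values that can be used
  if (!all(keep)) {
    # tell the user how many values are dropped
    message(sum(!keep), " non-positive value(s) removed")
    x <- x[keep]
  }
  result <- exp(mean(log(x)))
  return(result)
}

gm <- geo_mean_verbose(c(1:10, 50, -1, 0))   # prints a message about 2 values
gm_ref <- exp(mean(log(c(1:10, 50))))        # same calculation on clean input
```

Here gm and gm_ref hold the same value, since the two non-positive entries are removed before the calculation.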

Local vs. Global Variables

  • Local variable: defined inside the function.
  • Global variable: defined outside any function and accessible anywhere.
x <- 5
add_one <- function() {
  x <- 10
  return(x + 1)
}
add_one()  # Local, x is 10
[1] 11
x  # Global, x is 5
[1] 5
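
The reverse case is also worth seeing: when a function uses a name it has not defined itself, R looks it up outside the function. A small sketch with hypothetical names y, add_y and shadow_y:

```r
y <- 5   # defined outside any function

add_y <- function(x) {
  # y is not defined here, so R finds the outer y
  return(x + y)
}

shadow_y <- function() {
  y <- 100      # local y, hides the outer one inside this function only
  return(y)
}

from_global <- add_y(1)
from_local  <- shadow_y()
from_global   # 6
from_local    # 100
y             # still 5: the local assignment did not change it
```

Relying on outer variables makes functions harder to reuse; passing values as arguments is usually the safer choice.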

Let’s Practice!

Today’s Goals

  • Understand and use operators to filter data with precision.
  • Understand the basic structure of a function, learn to use new functions, and write functions to automate tasks.