R – Functions | Statistical Consulting Group

Functions are a way of isolating a task so that it can be applied repeatedly. They follow a basic structure.

name = function(inputs in function scope){  # function declaration
  
  body # Here, There be Dragons!
  
  return(output)  # spits back the output
}

 # sometime later

name(inputs in global scope)  # function call

The name of the function is its unique identifier. Within R, we can give a function any valid variable name we want.

Inputs are variables passed into a function. We can pass as many variables as we want. We can also give an input a default value, so that if we call the function repeatedly and that input does not change, we can leave it out of the function call.
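As a minimal sketch of default values, the hypothetical function `power` below (its name and inputs are illustrative, not part of the original text) gives its `exponent` input a default of 2, so a call can leave it out.

```r
# Hypothetical example: 'power' and its input names are illustrative only
power = function(base, exponent = 2){  # exponent defaults to 2
  return(base^exponent)
}

squared = power(5)                # exponent omitted, so the default of 2 is used
cubed = power(5, exponent = 3)    # default overridden in the function call
print(squared)
print(cubed)
```

Naming the input in the call (`exponent = 3`) is optional here, but it makes clear which default is being overridden.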

When outside variables are passed into the function, their values are assigned to variable names in the function's own scope. Within the function, these variables can then be manipulated without affecting the variables outside the function.
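The scoping behaviour can be seen in a short sketch. The function `add_one` is a hypothetical name used only for illustration; it changes its input inside the function, and the variable outside keeps its original value.

```r
# Hypothetical example: 'add_one' is illustrative only
x = 10

add_one = function(x){
  x = x + 1      # changes only the copy inside the function
  return(x)
}

y = add_one(x)
print(x)  # the outside variable is unchanged
print(y)  # the returned value reflects the in-function change
```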

The return statement allows us to pass a variable back out from the function. This variable can be any type of data structure we may need. Now let’s look at a simple function.

hw = function(){
  return("Hello World!")
}

a = hw()  # call the function, pass the output into new variable a
print(a)  # Print a

Now, we can write a slightly more advanced function to use input variables and perform manipulation on them.

bitstrings = function(n){ # declare bitstrings with the input n
  return(2^n)             # returns value of 2 to the n
}

print(bitstrings(8)) # prints the results of the call with n = 8

Because n = 8, we get a result of 256. There are 256 possible unique bitstrings when using 8 bits.
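One way to check the count is to list every bitstring for a small n and compare the number of rows with the function's result. This is only a sketch: the use of `expand.grid` and the names `all_strings` and `count` are assumptions made for illustration, and n = 3 is chosen to keep the output short.

```r
bitstrings = function(n){ # same function as above
  return(2^n)
}

n = 3
# every combination of 0 and 1 across n positions, one bitstring per row
all_strings = expand.grid(rep(list(0:1), n))
count = nrow(all_strings)

print(all_strings)
print(count == bitstrings(n))  # compares the listed count with 2^n
```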

Functions are very useful if we want to only work on a subset of a given dataset at a time. We can use bracket notation to subset the dataset on a given variable (or set of variables), and then pass only that subset to the function for further analysis. I will expand more on this in the bracket notation section.
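As a rough sketch of that idea, the code below subsets a small data frame with bracket notation and passes only the subset to a function. The data frame `scores`, its columns, and the function `mean_value` are all hypothetical and exist only for this illustration.

```r
# Hypothetical data and function names, for illustration only
scores = data.frame(group = c("a","b","a","b","a"),
                    value = c(4, 8, 6, 10, 5))

mean_value = function(d){   # takes a data frame with a 'value' column
  return(mean(d$value))
}

group_a = scores[scores$group == "a", ]  # keep only the rows in group "a"
mean_a = mean_value(group_a)             # the function sees only the subset
print(mean_a)
```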

A good example of a function is the body mass index (BMI) function. We pass it numeric inputs of height (in inches) and weight (in pounds). It converts them to metric units, applies the standard formula for BMI (weight in kilograms divided by the square of height in meters), and returns the value.

BMI = function(height,weight){
  return(0.45455*weight/(.0254*height)^2)
}
BMI(71,180)

Because the body of this function fits on one line without any temporary variables, we can omit the return() statement and define the function on a single line. R then returns the value of the last expression evaluated. The following code is equivalent to the previous definition.

BMI = function(height,weight){(0.45455*weight/(.0254*height)^2)}

With either definition, calling BMI(71,180) returns approximately 25.16, the BMI for a person who is 71 inches tall and weighs 180 lbs.

If we were so inclined, we could use some properties of more advanced data structures. Let us say we have the height and weight of several people. We can add another column to the data frame using the function.

h = c(68,70,65,74,69)
w = c(185,162,122,224,154)
people = data.frame(h,w)
people$bmi = BMI(people$h,people$w)
print(people)

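So we can run multiple rows through a custom function without looping over them. Because R's arithmetic operators work element by element on vectors, the BMI is computed for every row in a single call.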
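To make the comparison concrete, the sketch below computes the same values twice: once with a single call on the whole vectors, and once with an explicit loop over each person. The names `vectorized` and `looped` are introduced here for illustration only.

```r
BMI = function(height,weight){
  return(0.45455*weight/(.0254*height)^2)
}

h = c(68,70,65,74,69)
w = c(185,162,122,224,154)

vectorized = BMI(h, w)        # one call handles all five people

looped = numeric(length(h))   # the same calculation, one person at a time
for(i in seq_along(h)){
  looped[i] = BMI(h[i], w[i])
}

print(all.equal(vectorized, looped))  # checks that both approaches agree
```

The single call is shorter and avoids managing an index variable, which is why the data frame example above needs no loop.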