# Writing Pipe-friendly Functions

Pipes have been a fundamental part of computing for decades. In short, a pipe takes the output of its left-hand side and passes it as input to its right-hand side. For example, in a Linux shell, you might run `cat example.txt | sort | uniq` to take the contents of a text file, then sort the lines, then keep one copy of each distinct line. `|` is a common, but not universal, pipe operator; on U.S. QWERTY keyboards it shares a key with the backslash (`\`), above the RETURN key.

Languages that don’t begin by supporting pipes often eventually implement some version of them. In R, the magrittr package introduced the `%>%` infix operator as a pipe; it is most often pronounced “then”. For example: “take the `mtcars` data.frame, THEN take the `head` of it, THEN…” and so on.
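A function tends to be pipe-friendly when the data it works on is its first argument and it returns something the next step can consume. The sketch below is illustrative only: `top_by` is a hypothetical helper, not part of any package, and it uses base R's native pipe `|>` (available from R 4.1) so that it runs without installing anything. magrittr's `%>%` should read the same way here.

```r
# Hypothetical helper: the data frame comes first, so the function can sit
# on the right-hand side of a pipe.
top_by <- function(data, column, n = 3) {
  ordered <- data[order(data[[column]], decreasing = TRUE), , drop = FALSE]
  head(ordered, n)
}

# "Take mtcars, THEN keep the 3 rows with the highest mpg, THEN take their names."
result <- mtcars |> top_by("mpg", n = 3) |> rownames()
print(result)
```

Because `top_by` returns a data frame, further data-first steps can be chained after it in the same way.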

# Three Deep Truths About R

So, what are the implications of these statements?

# I’m ‘not in’ right now…

Checking whether an item is in a vector or not in a vector is a common task. The notation in R is a little inelegant when expressing the “not in” condition since the negation operator (`!`) is separated from the comparison operator (`%in%`):

```
5 %in% c(1, 2, 3, 4, 5)  # TRUE
!5 %in% c(1, 2, 3, 4, 5) # FALSE
```
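The reason the unparenthesized form works at all is operator precedence: R's `%op%` operators bind more tightly than unary `!`, so the negation is applied to the result of `%in%`. A minimal sketch (the variable names are introduced here for illustration):

```r
haystack <- c(1, 2, 3, 4, 5)

# `!` is applied last, so these two expressions are the same question:
# "is 9 NOT in haystack?"
unbracketed <- !9 %in% haystack
bracketed   <- !(9 %in% haystack)
same <- identical(unbracketed, bracketed)

# Forcing the negation first asks something else entirely:
# !9 is FALSE, and FALSE (coerced to 0) is then looked up in haystack.
negate_first <- (!9) %in% haystack

print(c(unbracketed = unbracketed, bracketed = bracketed, negate_first = negate_first))
```

So the notation is correct but easy to misread, which is the inelegance described above.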

R is a language where you can easily extend the set of built-in operators:

```
`%!in%` <-
  function(needle, haystack) {
    !(needle %in% haystack)
  }
```

Now, I can express my intentions reasonably clearly with my new, compact, infix operator `%!in%`:

```
5 %in% c(1, 2, 3, 4, 5)  # TRUE
5 %!in% c(1, 2, 3, 4, 5) # FALSE
```
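Like `%in%`, the new operator is vectorized over its left-hand side, so it can be used directly for filtering. A short sketch that repeats the definition so it runs on its own; the `allowed` and `species` vectors are made-up example data:

```r
`%!in%` <- function(needle, haystack) {
  !(needle %in% haystack)
}

allowed <- c("setosa", "versicolor")
species <- c("setosa", "virginica", "versicolor", "virginica")

# One logical value per element of `species`.
flags <- species %!in% allowed

# Keep only the values that are not in `allowed`.
unexpected <- species[flags]
print(unexpected)
```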

Moral: bend your tools to your will, not the other way ’round.

# Defensively install packages in R

Often, your R code will rely on having one or more R packages available. A little defensive coding will save users of your code—including future-you—from having to figure out which packages you’re using and then having to manually install them. This lowers the extraneous cognitive load associated with running older or unfamiliar code.

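`if (!"tidyverse" %in% rownames(installed.packages())) install.packages("tidyverse", dependencies = TRUE)`

Or, if you prefer to always use blocks with `if` statements:

```
# installed.packages() returns a matrix with one row per package;
# its row names are the package names.
if (!"tidyverse" %in% rownames(installed.packages())) {
  install.packages("tidyverse", dependencies = TRUE)
}
```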

With a little persistence, you can extend this to dealing with multiple packages:

```
pkgs <- c("tidyverse", "openxlsx")
# Install only the packages whose names are not already in the library.
install.packages(pkgs[!pkgs %in% rownames(installed.packages())], dependencies = TRUE)
```
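If you do this in more than one script, the check can be wrapped in a small function. The following is a sketch rather than a definitive implementation: `missing_packages` and `ensure_packages` are hypothetical names introduced here, and the sketch assumes it is preferable to skip the call to `install.packages()` entirely when nothing is missing.

```r
# Hypothetical helper: which of `pkgs` are not in the installed library?
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install only what is missing, and do nothing otherwise.
ensure_packages <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install, dependencies = TRUE)
  }
  invisible(to_install)
}

# "stats" and "utils" ship with R, so nothing is reported as missing here.
already_there <- missing_packages(c("stats", "utils"))
print(length(already_there))
```

A script would then start with something like `ensure_packages(c("tidyverse", "openxlsx"))` before its `library()` calls.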

# Getting started with R

R provides the backend: the programming language specification and the interpreter.

RStudio provides the frontend: the user interface that allows you to interact with R, visualize data, and manage the files associated with your analyses.

R for Data Science introduces you to the tidyverse way of programming. There are basically two methods of programming in R: “base R”, which has been around since the R language was first conceived (and before, since R is itself based on the S language), and the tidyverse, a newer approach that focuses on giving your data a consistent structure and on developing a grammar for data ingest, data wrangling, data visualization, and data storage.

Base R tends to be dense in meaning, where the tidyverse tends to be consistent and to break down complex processes into a set of discrete steps:

| base R | Tidyverse |
| --- | --- |
| `mtcars[2, "cyl"]` | `library(tidyverse)`<br>`mtcars %>% select(cyl) %>% slice(2)` |
| `mtcars[mtcars$cyl == 4, c("hp", "mpg")]` | `library(tidyverse)`<br>`mtcars %>% filter(cyl == 4) %>% select(hp, mpg)` |

# FizzBuzz in R

Functions are first-class objects in R. A function defined inside another function is a closure: it keeps a reference to the environment in which it was created, so it can still see the variables that existed there. So, you can use functions to create other functions in creative ways.

Here, I’ve written a function called `divisor` that returns a function that checks whether a given input, `d`, is evenly divisible by `number`; if so, it returns `string`, and otherwise it returns an empty string. Then I use `divisor` to create a test for divisibility by 3 and another for divisibility by 5.

Problem: Given a range of positive, non-zero integers, output “Fizz” if the number is evenly divisible by 3, output “Buzz” if the number is evenly divisible by 5, and output “FizzBuzz” if the number is evenly divisible by both 3 and 5; otherwise, output the number.

Solution:

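```
# Function factory: returns a function that maps d to `string` when d is
# evenly divisible by `number`, and to "" otherwise.
divisor <-
  function(number, string) {
    function(d) {
      if (d %% number == 0) string else ""
    }
  }

mod3er <- divisor(3, "Fizz")
mod5er <- divisor(5, "Buzz")

fizzbuzz <-
  function(i) {
    # "Fizz", "Buzz", "FizzBuzz", or "" when neither divisor matches.
    res <- paste0(mod3er(i), mod5er(i))
    ifelse(res == "", i, res)
  }

sapply(1:100, fizzbuzz)
```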
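One way to see the closures at work is to look inside the environment that each generated function carries with it. This is a small sketch that repeats the definition of `divisor` so it runs on its own; `number3`, `string5`, and `separate` are names introduced here for illustration.

```r
divisor <- function(number, string) {
  function(d) {
    if (d %% number == 0) string else ""
  }
}

mod3er <- divisor(3, "Fizz")
mod5er <- divisor(5, "Buzz")

# Each generated function has its own enclosing environment holding the
# `number` and `string` it was created with.
number3 <- get("number", envir = environment(mod3er))
string5 <- get("string", envir = environment(mod5er))

# The two closures share code but not state.
separate <- !identical(environment(mod3er), environment(mod5er))

print(list(number3 = number3, string5 = string5, separate = separate))
```

This is why `mod3er` and `mod5er` can be called later with only `d`: the other two values travel with the function.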

# R Statistical Programming Language

The R Project provides a comprehensive, free, open source statistical programming language and environment based on the S language. R is the name of both the language and the environment in which you generally use the language. It’s an interactive environment where the commands you enter generate immediate results that you can use to guide your analyses.