The “extract” Function in R

Mar 5

Purpose: To split one character column of a data frame into several new columns, using the capture groups of a regular expression pattern.

remove: Whether to remove the original column after extraction. The default is TRUE.

Example (with Explanation):
# Load necessary packages
library(tidyr)

# Create a sample data frame
data <- data.frame(
  ID = 1:3,
  Name = c("John Doe", "Jane Smith", "Bob Johnson")
)

# Extract the first and last names from the 'Name' column.
# tidyr:: is spelled out because another loaded package on my system
# masked extract() with its own function of the same name.
result <- tidyr::extract(
  data,
  col = Name,
  into = c("First_Name", "Last_Name"),
  regex = "(\\w+) (\\w+)"
)

# Display the result
print(result)
In this example, the extract function from the tidyr package pulls the first and last names out of the ‘Name’ column of the sample data frame data. The regular expression pattern "(\\w+) (\\w+)" has two capture groups, each matching a run of word characters, separated by a single space. The first group is stored in the new column ‘First_Name’ and the second in ‘Last_Name’. Because remove defaults to TRUE, the new data frame result contains ‘ID’, ‘First_Name’ and ‘Last_Name’, but no longer has the original ‘Name’ column.
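
To see what the remove argument changes, here is a minimal sketch that keeps the original column by setting remove = FALSE. The data frame people and the one-word name "Cher" are made up for illustration; that row is there to show that a value the pattern does not match should come back as NA in the new columns.

```r
library(tidyr)

# Hypothetical sample data; the third name has no space, so the
# two-group pattern below will not match it.
people <- data.frame(
  ID = 1:3,
  Name = c("John Doe", "Jane Smith", "Cher"),
  stringsAsFactors = FALSE
)

# Same pattern as in the example above, but remove = FALSE keeps 'Name'.
kept <- tidyr::extract(
  people,
  col = Name,
  into = c("First_Name", "Last_Name"),
  regex = "(\\w+) (\\w+)",
  remove = FALSE
)

print(kept)
```

If rows with missing pieces are a problem, they can be found afterwards with is.na(kept$First_Name), or the pattern can be loosened to make the second group optional.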

The “fill” Function in R