Scrambling the letters of a message with R

One of the most powerful aspects of R is its diverse set of random number generators. We can use these tools to obscure a message inside what appears to be a meaningless string of text (a basic form of cryptography). In this tutorial, I am going to outline how to do this by randomly scrambling the letters of a message and adding some dummy random characters. This is made possible by the set.seed() function, which makes random number generation reproducible: as long as we specify the same seed every time we run the code, we get the same "random" numbers. Throughout this tutorial, I am going to start by illustrating the basic idea and work toward simple-to-use functions that handle all of the work.

So let’s start with a message, the arbitrary “This is a test!”. First, we use strsplit() to split the message into a vector where each character is its own element. Next, I set the seed for scrambling to 123 and create the scrambled indexes. Finally, I reorder the split message using those indexes and paste the characters back together, undoing what strsplit() did. If the code in this tutorial is difficult to read, you can check out the code I used on GitHub here: https://github.com/statswithrdotcom/Message-Scrambler-Tutorial-Code.

Message = "This is a test!" # The message
Mess_split = strsplit(Message, split = "")[[1]] # Vectorizes the message
set.seed(123) # The seed for the random numbers
Scrambed_Ind = sample(1:length(Mess_split),length(Mess_split),replace = FALSE) # Random numbers
Mess_Scrammed = Mess_split[Scrambed_Ind] # The random numbers become new indexes
Mess_Scrammed = paste(Mess_Scrammed, collapse="") # The message is now a single element again
print(Mess_Scrammed) # The scrambled message
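As a quick sanity check on the idea above, here is a minimal sketch (not part of the original code; the names first_draw and second_draw are only illustrative) showing that resetting the same seed reproduces the same permutation. One caveat I am assuming matters in practice: R 3.6.0 changed the default algorithm behind sample(), so the same seed can give a different permutation on older versions of R.

```r
# Same seed, same permutation: the property the whole tutorial relies on.
set.seed(123)
first_draw <- sample(1:15, 15, replace = FALSE)

set.seed(123)
second_draw <- sample(1:15, 15, replace = FALSE)

identical(first_draw, second_draw)  # TRUE

# Every index appears exactly once, so no character is lost or duplicated.
identical(sort(first_draw), 1:15)   # TRUE
```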

I have only done half of the process; now I need a way to reliably undo the scrambling. Notice that the message can only be unscrambled if I specify the same seed that was used to scramble it. The structure is similar to what we did before, but now the process runs in reverse: the for loop takes each element of the scrambled message and assigns it to the appropriate index of the initially empty Mess_Unscrammed vector.

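Message_Scrammed = strsplit(Mess_Scrammed, split = "")[[1]] # Vectorizes the scrambled message
set.seed(123) # Must match the seed used for scrambling
Scrambed_Ind = sample(1:length(Message_Scrammed),length(Message_Scrammed),replace = FALSE) # Regenerates the same indexes
Mess_Unscrammed = character(length(Message_Scrammed)) # Empty vector for the original order
counter = 0
for(i in Scrambed_Ind){
  counter = counter + 1
  Mess_Unscrammed[i] <- Message_Scrammed[counter] # Scrambled position "counter" came from original position i
}
paste(Mess_Unscrammed, collapse = "") # The original message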
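The loop above can also be written without a counter. The sketch below is a self-contained alternative, not the code used in the rest of this tutorial, and the variable names (original, perm, restored, restored_alt) are just for illustration. It shows two equivalent ways to invert a permutation: indexed assignment, and order(), which returns the inverse permutation.

```r
# Stand-alone example; variable names are illustrative only.
original <- strsplit("This is a test!", split = "")[[1]]

set.seed(123)
perm <- sample(seq_along(original))   # permutation used to scramble
scrambled <- original[perm]

# Option 1: assign every scrambled character back to its source position.
restored <- character(length(scrambled))
restored[perm] <- scrambled

# Option 2: order(perm) is the inverse permutation of perm.
restored_alt <- scrambled[order(perm)]

paste(restored, collapse = "")        # "This is a test!"
identical(restored, restored_alt)     # TRUE
```

Either option does the same job as the for loop; the loop is kept in the functions below because it makes each step explicit.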

Now, to improve the code, I am going to wrap it in functions so that later steps can build on it concisely. The first function is “Scrambler”, which scrambles the given message using the “Seed” argument. The analogous “Unscrambler” takes the scrambled message and the seed that was used to scramble the original message.

Scrambler <- function(Message, Seed = 123){
  Mess_split = strsplit(Message, split = "")[[1]] # Vectorizes the message
  set.seed(Seed)
  Scrambed_Ind = sample(1:length(Mess_split),length(Mess_split),replace = FALSE)
  Mess_Scrammed = paste(Mess_split[Scrambed_Ind], collapse="")
  return(Mess_Scrammed)
}

Unscrambler <- function(Scrambled, Seed = 123){
  Message_Scrammed = strsplit(Scrambled, split = "")[[1]]
  set.seed(Seed) # Same seed, so the same indexes are regenerated
  Scrambed_Ind = sample(1:length(Message_Scrammed),length(Message_Scrammed),replace = FALSE)
  Mess_Unscrammed = character(length(Message_Scrammed))
  counter = 0
  for(i in Scrambed_Ind){
    counter = counter + 1
    Mess_Unscrammed[i] <- Message_Scrammed[counter] # Put each character back at its original index
  }
  return(paste(Mess_Unscrammed, collapse = ""))
}

#Tests
Info_Scrambled <- Scrambler("Cryptography is Fun!", Seed = 123)
print(Info_Scrambled)
Info_Unscrambled <- Unscrambler(Info_Scrambled, Seed = 123)
print(Info_Unscrambled)

It is time to add some complexity; after all, a scrambled message is not that hard to decipher if it is short enough. To do this, I am going to create a helper function that generates random upper-case letters, lower-case letters, and numbers. Additionally, when the message is unscrambled I want these random characters removed automatically, so the helper places the vowels “AEIOU” in front of them to indicate where the random characters start. We simply provide how many random characters we want; note that the function actually returns n+1 of them after the indicator, so even n = 0 yields one random character.

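rand_letters_nums <- function(n){
  Choices = c(letters,LETTERS,0:9) # Pool of lower-case letters, upper-case letters, and digits
  Ind = sample(1:length(Choices),n+1,replace = TRUE) # n+1 random positions in the pool
  Output = Choices[Ind]
  Indicator = c("A","E","I","O","U") # Marks where the random characters start
  return(c(Indicator,Output))
}

Rand_Stuff <- rand_letters_nums(0)
print(Rand_Stuff) # The five indicator letters followed by one random character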

Now let’s go ahead and incorporate these random letters into our functions to make the messages much harder to decipher. In the Scrambler function, we append the output of the random letters function to the Mess_split variable, so the random letters are scrambled together with the real message. Unscrambling still works because the scrambled indexes depend only on the seed and the total length of the string, not on which random characters were added. The Unscrambler function is only a little more complex. At the end of the function, the unscrambled text is split with strsplit() at the point where the indicator “AEIOU” appears, and only the part before it is kept, thereby removing the indicator and all of the random characters after it. Note that this means “AEIOU” in that exact form cannot be in the original message, so if this is possible in your use case you will want to modify the indicator in both functions.

Scrambler <- function(Message, Seed = 123, Noise = 100, Indicator = TRUE){
  # Note: the Indicator argument is not used in this version
  Mess_split = strsplit(Message, split = "")[[1]]
  Mess_split = c(Mess_split, rand_letters_nums(Noise)) # Message, then "AEIOU", then random characters
  set.seed(Seed)
  Scrambed_Ind = sample(1:length(Mess_split),length(Mess_split),replace = FALSE)
  Mess_Scrammed = paste(Mess_split[Scrambed_Ind], collapse="")
  return(Mess_Scrammed)
}

Unscrambler <- function(Scrambled, Seed = 123){
  Message_Scrammed = strsplit(Scrambled, split = "")[[1]]
  set.seed(Seed)
  Scrambed_Ind = sample(1:length(Message_Scrammed),length(Message_Scrammed),replace = FALSE)
  Mess_Unscrammed = character(length(Message_Scrammed))
  counter = 0
  for(i in Scrambed_Ind){
    counter = counter + 1
    Mess_Unscrammed[i] <- Message_Scrammed[counter]
  }
  Tentative_Output = paste(Mess_Unscrammed, collapse = "")
  # Split at the indicator and keep only the text before it
  Output = strsplit(Tentative_Output, split = paste(c("A","E","I","O","U"), collapse = ""))[[1]]
  return(Output[1])
}

#Test
Info2 = Scrambler("This is better!", Seed = 123)
print(Info2)
Info2_Unscram = Unscrambler(Info2, Seed = 123)
print(Info2_Unscram)
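The caveat about “AEIOU” appearing in the original message can be checked up front. The helper below is hypothetical (check_indicator and its Marker argument are my own illustration, not part of the tutorial's functions); it is a minimal sketch of a guard you might call before scrambling, along with a small demonstration of what goes wrong without it.

```r
# Hypothetical helper: refuse messages that already contain the marker.
check_indicator <- function(Message, Marker = "AEIOU") {
  if (grepl(Marker, Message, fixed = TRUE)) {
    stop("Message contains the marker '", Marker, "'; choose a different marker.")
  }
  invisible(TRUE)
}

# The failure mode, shown without any scrambling involved:
# message + marker + noise, then split at the first marker.
padded <- paste0("I like AEIOU vowels", "AEIOU", "x7Qp")
truncated <- strsplit(padded, split = "AEIOU")[[1]][1]
truncated                               # "I like " (the message is cut short)

check_indicator("This is better!")      # passes silently
```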

Now we have only one problem: seeds are easy to brute force. One could write a loop that tries every seed number and looks at which one does not produce gibberish. To fix this, we are going to require a real password, which is much harder to brute force because the number of possible passwords grows rapidly with their length.

To start, I am going to introduce a helper function I created in my neurocryptography book that transforms a password into a series of numbers, each being the index of that character in the “Pos” variable embedded in the Pass_Index function. Then, in the Password_Protect function, the message is scrambled repeatedly, with the seeds being the indexes generated by the helper function. So each character in the password, including spaces, causes the message to be scrambled once. In the Password_Remove function, the opposite occurs: the indexes of the password are reversed with the rev() function so that the most recent scramble is undone first.

We now have a system that is complex enough that it could be called cryptography. Note that this is for informational use only, and I am not suggesting that this is a method suitable for protecting sensitive information of any kind.

Pass_Index <- function(Input){
  Input <- strsplit(Input,split = "")[[1]] # One element per password character
  Pos <- as.character(c(""," ", letters,LETTERS,"?",".","!",0:9,"'",
    "/","+","-","<",">","@","$","%","#","^","&","*","(",")","_","=",","))
  Index <- match(Input, Pos) # Position of each password character in Pos
  if(anyNA(Index)){
    stop("The password contains a character that is not in Pos")
  }
  return(Index)
}

Password_Protect <- function(Message, Password = "Crypto", Noise = 10){
  Pass_seq = Pass_Index(Password) # One seed per password character
  for(i in Pass_seq){
    Message = Scrambler(Message, Seed = i, Noise = Noise) # Each pass adds noise and scrambles again
  }
  return(Message)
}

Password_Remove <- function(Scrambled, Password = "Crypto"){
  Pass_seq = rev(Pass_Index(Password)) # Undo the scrambles in reverse order
  for(i in Pass_seq){
    Scrambled = Unscrambler(Scrambled, Seed = i)
  }
  return(Scrambled)
}

#Test
Info3 = Password_Protect("This is better!", Password = "Test")
print(Info3)
Info3_Unscram = Password_Remove(Info3, Password = "Test") # Unscramble Info3, not Info2
print(Info3_Unscram)
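To make the brute-force concern raised earlier concrete, here is a rough sketch of the seed-guessing loop against a single-seed scramble. It is only an illustration under some assumptions: scramble_demo and unscramble_demo are hypothetical noise-free stand-ins for the first Scrambler and Unscrambler, the attacker only tries seeds 1 to 500, and the attacker guesses that the word "Crypto" appears in the message.

```r
# Hypothetical noise-free stand-ins for the tutorial's first pair of functions.
scramble_demo <- function(Message, Seed) {
  chars <- strsplit(Message, split = "")[[1]]
  set.seed(Seed)
  paste(chars[sample(seq_along(chars))], collapse = "")
}

unscramble_demo <- function(Scrambled, Seed) {
  chars <- strsplit(Scrambled, split = "")[[1]]
  set.seed(Seed)
  perm <- sample(seq_along(chars))
  out <- character(length(chars))
  out[perm] <- chars               # put each character back where it came from
  paste(out, collapse = "")
}

secret <- scramble_demo("Cryptography is Fun!", Seed = 123)

# Try a range of seeds and keep those whose output contains the guessed word.
candidate_seeds <- 1:500
guessed_word <- "Crypto"           # assumption: the attacker guesses a likely word
hits <- Filter(
  function(s) grepl(guessed_word, unscramble_demo(secret, s), fixed = TRUE),
  candidate_seeds
)

hits                               # contains 123; other seeds could match by chance
unscramble_demo(secret, Seed = 123)
```

The loop only needs a few hundred cheap attempts here, which is why a single integer seed on its own is a weak secret.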

If you would like to see how this all works in an R Shiny application, you can find it here: https://www.statswithr.com/scrambler-app.
