How To Shuffle A Bash Array

Is there any way to shuffle the elements in a Bash array?

Yes, but you must write your own routine. Unlike languages such as Python, which provide a dedicated shuffle function, Bash has no built-in way to shuffle an array.

Here is one example of a custom shuffle script.

#!/bin/bash
# -------------------------------------------
# Shuffle array
# Contents of array src are shuffled randomly
# and result is stored in destination array
# -------------------------------------------
IFS=$'\n'
src=(red green blue yellow orange black 'hello world')
dest=()
# Show original array
echo -e "\xe2\x99\xa6Original array\xe2\x99\xa6"
echo "${src[@]}"
# Function to check if item already exists in array
function checkArray
{
    for item in ${dest[@]}
    do
        [[ "$item" == "$1" ]] && return 0 # Exists in dest
    done
    return 1 # Not found
}
# Main loop
while [ "${#dest[@]}" -ne "${#src[@]}" ]
do
    rand=$[ $RANDOM % ${#src[@]} ]
    checkArray "${src[$rand]}" || dest=(${dest[@]} "${src[$rand]}")
done
# Show result
echo -e "\n\xe2\x99\xa6Shuffled Array\xe2\x99\xa6"
echo "${dest[@]}"

How It Works

The idea is to pick an element at random from the source array src and append it to the destination array dest. We start with an empty dest array and loop until it contains one of each element from the src array. Since each element is chosen at random, the order of elements in dest will be random as well. This simulates an array shuffle.
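The pick-and-retry loop has to draw again whenever it lands on an element that is already in dest. For comparison, here is a minimal sketch of a different technique, the Fisher-Yates shuffle, which swaps elements within a copy of the array and finishes in a single pass. The function name shuffle_array and the result array shuffled are made up for this illustration; they are not part of the script above.

```shell
#!/usr/bin/env bash
# Sketch of a Fisher-Yates shuffle (a different technique from the
# pick-and-retry loop in the script above).
# shuffle_array is a hypothetical helper: it copies its arguments into
# the global array "shuffled" and shuffles that copy in place.
shuffle_array() {
    shuffled=("$@")
    local i j tmp
    for (( i = ${#shuffled[@]} - 1; i > 0; i-- )); do
        j=$(( RANDOM % (i + 1) ))   # random index from 0 to i
        tmp=${shuffled[i]}          # swap elements i and j
        shuffled[i]=${shuffled[j]}
        shuffled[j]=$tmp
    done
}

src=(red green blue yellow orange black 'hello world')
shuffle_array "${src[@]}"
printf '%s\n' "${shuffled[@]}"
```

Because this version only swaps positions and never compares values, it also copes with arrays that contain the same value more than once. RANDOM % n carries a slight bias whenever n does not divide 32768 evenly, which should not matter for a small demo like this.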

Notes

IFS=$'\n'

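This is essential for handling elements that contain spaces, such as 'hello world'. The script expands ${dest[@]} without double quotes, so Bash splits the result on the characters in IFS. With IFS set to a newline only, hello world is treated as one element, not two.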
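To see the effect of IFS on its own, here is a small sketch that counts how many words an array expands to under different settings. The helper count_args is hypothetical and exists only for this demo.

```shell
#!/usr/bin/env bash
# Demo: how IFS affects an unquoted array expansion.
# count_args is a hypothetical helper that reports how many words it received.
count_args() { echo "$#"; }

src=(red 'hello world')

IFS=$' \t\n'                               # Bash's default IFS
default_unquoted=$(count_args ${src[@]})   # 3: 'hello world' is split in two
IFS=$'\n'
newline_unquoted=$(count_args ${src[@]})   # 2: spaces no longer split words
IFS=$' \t\n'
quoted=$(count_args "${src[@]}")           # 2: quoting keeps each element whole

echo "$default_unquoted $newline_unquoted $quoted"
```

The last line of the demo suggests an alternative: when every array expansion is wrapped in double quotes, elements with spaces survive without changing IFS at all.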

# Function to check if item already exists in array
function checkArray
{
    for item in ${dest[@]}
    do
        [[ "$item" == "$1" ]] && return 0 # Exists in dest
    done
    return 1 # Not found
}

This function is critical because it prevents duplicate elements. The same element might be chosen at random more than once, and we do not want to add it to dest again, so we first check whether it is already present. If it is not, the element is added to dest. If it is, the element is skipped and another one is chosen at random. Note that this approach assumes every element of src is unique: if src contained the same value twice, dest could never reach the same length as src and the main loop would never end.
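As a variation, the same membership test can take the array elements as arguments instead of reading the global dest. This is only a sketch, and the name in_array is hypothetical. The first argument is the item to look for and the remaining arguments are the elements to search.

```shell
#!/usr/bin/env bash
# in_array is a hypothetical variant of checkArray: the first argument
# is the item to look for, the remaining arguments are the array elements.
in_array() {
    local needle=$1 item
    shift
    for item in "$@"; do
        [[ "$item" == "$needle" ]] && return 0   # found
    done
    return 1                                     # not found
}

dest=(red 'hello world')
in_array 'hello world' "${dest[@]}" && echo "hello world: already in dest"
in_array 'hello'       "${dest[@]}" || echo "hello: not in dest yet"
```

Passing "${dest[@]}" in double quotes keeps 'hello world' as a single element, so this version does not rely on the IFS setting.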

while [ "${#dest[@]}" -ne "${#src[@]}" ]

At first, src and dest will be of different lengths. The main loop repeats until the sizes of both arrays match. The # character in ${#dest[@]} gives the number of elements in the array. When the two counts match, all elements have been added and the loop quits since there is nothing more to add.

${dest[@]}

Using the @ character tells Bash to expand the array to all of its elements. Try not to confuse @ with the * character, as in ${dest[*]}. Both appear to produce the same output in a terminal, but they differ when the expansion is placed inside double quotes. "${dest[@]}" expands each element to a separate word, which is what we want, while "${dest[*]}" joins all elements into a single word, separated by the first character of IFS.

For the length test, ${#dest[*]} -ne ${#src[*]} runs the same way as ${#dest[@]} -ne ${#src[@]}, because both forms return the number of elements. Logically, what we want is the @ form, but which one you use depends on how you prefer to think about what the script does behind the scenes.
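The difference between @ and * is easiest to see by counting words. The sketch below uses a hypothetical helper, count_args, that simply reports how many arguments it was given.

```shell
#!/usr/bin/env bash
# Demo: "${arr[@]}" versus "${arr[*]}" inside double quotes.
# count_args is a hypothetical helper that reports how many words it received.
count_args() { echo "$#"; }

dest=(red green 'hello world')

at_words=$(count_args "${dest[@]}")     # one word per element
star_words=$(count_args "${dest[*]}")   # all elements joined into one word
length=${#dest[@]}                      # element count; ${#dest[*]} gives the same number

echo "@: $at_words  *: $star_words  length: $length"
```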

\xe2\x99\xa6 - What Is \xe2?

This embellishes the output by inserting Unicode diamond symbols (U+2666). We must use the UTF-8 sequence, not the UTF-16 sequence, which is different. UTF-8 is written as a sequence of 8-bit bytes, here e2 99 a6, with each byte preceded by the \x hexadecimal escape. The -e option tells echo to interpret escape sequences. Without -e, \xe2 is printed literally.
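The same bytes can also be produced with Bash's printf, which interprets \x escapes in its format string without any extra option. This is a small sketch, assuming a terminal that displays UTF-8. The variable name diamond is made up for the demo.

```shell
#!/usr/bin/env bash
# Demo: build the diamond (U+2666) from its UTF-8 bytes with printf.
# "diamond" is a hypothetical variable name used only in this sketch.
diamond=$(printf '\xe2\x99\xa6')
printf '%sOriginal array%s\n' "$diamond" "$diamond"

# Dump the raw bytes to confirm the sequence is e2 99 a6.
printf '%s' "$diamond" | od -An -tx1
```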

'hello world'

Since whitespace separates elements in an array assignment, we must quote any element that contains spaces. Single quotes are used here; double quotes would work as well.


rand=$[ $RANDOM % ${#src[@]} ]

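This picks a valid index that is then used to access an element of the array. $RANDOM is a Bash variable that returns a random integer from 0 to 32767 each time it is read, and the modulus operator keeps the result within the array limits. The $[ ... ] form is an older arithmetic syntax that Bash still accepts; $(( ... )) is the current equivalent.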
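For readers who prefer the $(( ... )) form, here is a minimal sketch of the same index calculation. It reuses the src array from the script and assumes nothing else.

```shell
#!/usr/bin/env bash
# Demo: the same index calculation written with $(( ... )).
src=(red green blue yellow orange black 'hello world')

rand=$(( RANDOM % ${#src[@]} ))   # integer from 0 to ${#src[@]} - 1
echo "index $rand -> ${src[rand]}"
```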

checkArray "${src[$rand]}" || dest=(${dest[@]} "${src[$rand]}")

This is the heart of the script that accepts or rejects the random word. Function checkArray is called to find out if the word already exists in dest. The OR (||) is a shortcut that eliminates the need for an if statement and places two commands on the same line.

Function checkArray returns either a 0 (already exists) or a 1 (does not exist). If checkArray returns 1, meaning the chosen word does not already exist in dest, then the following command after the || is executed and the word is appended to dest. If checkArray returns 0, meaning the word already exists, then the second command is skipped and the loop tries again by generating a new random index.
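The short-circuit behavior of || can be tried in isolation. In this sketch, already_present and is_new are hypothetical stand-ins for the two possible outcomes of checkArray, and log is a scratch array used only to record which command ran.

```shell
#!/usr/bin/env bash
# Demo: cmd1 || cmd2 runs cmd2 only when cmd1 returns a non-zero status.
# already_present and is_new are hypothetical stand-ins for the two
# outcomes of checkArray.
already_present() { return 0; }
is_new()          { return 1; }

log=()
already_present || log+=("appended after status 0")   # skipped
is_new          || log+=("appended after status 1")   # runs

printf '%s\n' "${log[@]}"
```

Only the second message is recorded, which mirrors how the script appends to dest only when checkArray reports that the word is not there yet.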
