Python 3 Quick Code: Get a Random Item from a List with an Even Chance of Being Chosen

June 12, 2015
In Python 3, suppose we want to get a random item from a list that contains duplicates, but we do not want to favor the duplicates.

For example, if an item appears 3 times in a list of 10, it is three times as likely to be picked as an item that appears only once (3-in-10 versus 1-in-10).
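To make the arithmetic concrete, here is a minimal sketch that computes each item's chance under a uniform pick from the list. The list `items` is a made-up example, not data from this post.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical list of 10 items in which 'a' appears 3 times.
items = ['a', 'a', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

counts = Counter(items)

# Chance of each distinct item when one position in the list is picked uniformly.
chances = {item: Fraction(n, len(items)) for item, n in counts.items()}

print(chances['a'])  # 3/10
print(chances['b'])  # 1/10
```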

The idea is that by eliminating the duplicates, all items should have an equal chance of being chosen. Here is one strategy to achieve this using a set.

The Concept

In programming, a set is an unordered data structure containing unique items.

Unordered means that the items are not kept in any particular order. When the contents of a list are assigned to a set, the items can come out in any order, as we will see. If you wish to sort the items in a set, then sorting must be performed separately. Sorting is not automatic with a set because a set does not record positions the way a list does. However, we want a random item, so sorting is unnecessary here.
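As a small illustration of "unique but unordered," the sketch below builds a set from a made-up list (`fruits` is an assumption for this example) and sorts it separately with the built-in sorted(), which returns a new list.

```python
# Hypothetical list with repeated entries.
fruits = ['pear', 'fig', 'pear', 'kiwi', 'fig']

unique_fruits = set(fruits)      # duplicates are dropped
print(len(unique_fruits))        # 3

# A set has no order of its own; sorted() builds an ordered list from it.
ordered = sorted(unique_fruits)
print(ordered)                   # ['fig', 'kiwi', 'pear']
```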



Beginning with a list that contains duplicates, we can follow these steps:

  1. Convert the list to a set so all items will be unique.
  2. Use the random.sample() function to get a random item from the set (on Python 3.11 and later, convert the set to a list first).
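The two steps can be sketched as one small helper. The function name `random_unique_item` and the color list are assumptions made for this illustration. The list() call is there because random.sample() only accepts a sequence from Python 3.11 onward.

```python
import random


def random_unique_item(items):
    """Return one item, giving every distinct value the same chance."""
    unique_items = set(items)  # step 1: drop duplicates
    # Step 2: sample() returns a list, so take its only element.
    return random.sample(list(unique_items), 1)[0]


picked = random_unique_item(['red', 'red', 'red', 'blue', 'green'])
print(picked)
```

Note that an empty input would make random.sample() raise ValueError, so a caller may want to check for that case first.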


Let’s look at some Python 3 code:

#!/usr/bin/env python3
import random

wordlist = ['apple', 'toy', 'apple', 'coin', 'box', 'visible', 'graveyard', 'apple', 'cemetery', 'ort', 'ort', 'phatic', 'floccinaucinilihilipilification', 'omphalos', 'aeolian', 'floccinaucinilihilipilification', 'pasquinade', 'complaisant', 'floccinaucinilihilipilification', 'floccinaucinilihilipilification']
# Converting the list to a set drops the duplicates.
wordset = set(wordlist)
# random.sample() requires a sequence on Python 3.11+, so pass a list built from the set.
print(random.sample(list(wordset), 1))

How It Works

In this example, we use a list of words, but we could read a wordfile or have a long list of numbers. However, lists may contain duplicates, and duplicates certainly exist here.

wordlist = ['apple', 'toy', 'apple', 'coin', 'box', 'visible', 'graveyard', 'apple', 'cemetery', 'ort', 'ort', 'phatic', 'floccinaucinilihilipilification', 'omphalos', 'aeolian', 'floccinaucinilihilipilification', 'pasquinade', 'complaisant', 'floccinaucinilihilipilification', 'floccinaucinilihilipilification']

We want each word to have an even chance of being chosen, but the repeated words have an added advantage. ‘apple’ appears three times, ‘ort’ appears two times, and ‘floccinaucinilihilipilification’ (Yes! A chance to use this word! ☺) appears 4 times out of 20.

The point is that if we choose a word at random from the list now, floccinaucinilihilipilification has the greatest chance of being chosen: 4 out of 20 (a 1-in-5 probability), compared with 1-in-20 for words that appear once, such as aeolian and phatic. We want to eliminate that unfair advantage and give each of the 14 distinct words an even 1-in-14 probability.
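A quick simulation can make the difference visible. This is only a sketch: it uses random.choice() on a seeded random.Random instance so the run is repeatable, and the seed, the draw count, and the names `LONG_WORD`, `share_list`, and `share_unique` are choices made for this example.

```python
import random
from collections import Counter

wordlist = ['apple', 'toy', 'apple', 'coin', 'box', 'visible', 'graveyard',
            'apple', 'cemetery', 'ort', 'ort', 'phatic',
            'floccinaucinilihilipilification', 'omphalos', 'aeolian',
            'floccinaucinilihilipilification', 'pasquinade', 'complaisant',
            'floccinaucinilihilipilification', 'floccinaucinilihilipilification']
LONG_WORD = 'floccinaucinilihilipilification'

rng = random.Random(2015)             # fixed seed, chosen arbitrarily
draws = 20000
unique_words = sorted(set(wordlist))  # 14 distinct words, as a list

# Count how often each word comes up when drawing from each population.
from_list = Counter(rng.choice(wordlist) for _ in range(draws))
from_unique = Counter(rng.choice(unique_words) for _ in range(draws))

share_list = from_list[LONG_WORD] / draws
share_unique = from_unique[LONG_WORD] / draws
print(f"drawn from the list:   {share_list:.3f} (theory: {4 / 20:.3f})")
print(f"drawn from unique set: {share_unique:.3f} (theory: {1 / 14:.3f})")
```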

The trick is to create a set from the list. Doing this automatically discards duplicates, so we are left with a set containing unique items. We use the Python 3 built-in set(), which takes a list (or any other iterable) as an argument and returns a set.

wordset = set(wordlist)


Resulting Set:

{'graveyard', 'floccinaucinilihilipilification', 'omphalos', 'apple', 'toy', 'box', 'visible', 'cemetery', 'pasquinade', 'phatic', 'complaisant', 'aeolian', 'coin', 'ort'}

Note: The items might not come out of the set in the same order each time, because Python randomizes string hashes for each run by default. If we print the set immediately after the assignment, we can see this when we run the program multiple times.

wordset = set(wordlist)
print(wordset)

Run 1:
{'box', 'phatic', 'floccinaucinilihilipilification', 'visible', 'omphalos', 'complaisant', 'apple', 'coin', 'toy', 'cemetery', 'pasquinade', 'graveyard', 'aeolian', 'ort'}
Run 2:
{'cemetery', 'box', 'phatic', 'aeolian', 'omphalos', 'visible', 'pasquinade', 'complaisant', 'apple', 'ort', 'toy', 'coin', 'graveyard', 'floccinaucinilihilipilification'}
Run 3:
{'pasquinade', 'box', 'visible', 'graveyard', 'floccinaucinilihilipilification', 'omphalos', 'coin', 'ort', 'apple', 'toy', 'phatic', 'aeolian', 'complaisant', 'cemetery'}
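If a repeatable order matters for some other purpose, one alternative (not the approach used in this post) is to remove duplicates with dict.fromkeys(), which keeps the first occurrence of each item in its original position on Python 3.7 and later. The short `words` list is a made-up example.

```python
# Hypothetical list with repeated entries.
words = ['apple', 'toy', 'apple', 'coin', 'ort', 'ort']

# Dictionary keys are unique and keep insertion order (Python 3.7+).
unique_in_order = list(dict.fromkeys(words))
print(unique_in_order)  # ['apple', 'toy', 'coin', 'ort']
```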

While we could pop an item off the set and be done, set.pop() returns an arbitrary item, which is not the same as a random one. Let’s use the sample() function to be certain that we are truly getting a random item no matter the platform the code runs on. Since Python 3.11, sample() requires a sequence, so we convert the set to a list first.

print(random.sample(list(wordset), 1))

For lists, a simple way to get a random item is to shuffle the list and get the first item.
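The shuffle approach for lists might look like the sketch below. The short `words` list is a made-up example, and the copy is an extra precaution so the original list keeps its order.

```python
import random

# Hypothetical list for illustration.
words = ['apple', 'toy', 'coin', 'box']

shuffled = list(words)    # work on a copy; shuffle() reorders in place
random.shuffle(shuffled)  # returns None, so do not assign its result
first = shuffled[0]       # lists support indexing
print(first)
```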


However, this does not work for a set since a set does not support indexing. This means we cannot use slicing or something like wordset[0] to get the first item from the set.

TypeError: 'set' object does not support indexing
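The failure is easy to reproduce. In this sketch the small `wordset` is a made-up example, and the exact wording of the error message depends on the Python version, so the code prints whatever message it receives.

```python
# Hypothetical small set for illustration.
wordset = {'apple', 'toy', 'coin'}

try:
    first = wordset[0]  # sets have no positions, so this raises TypeError
    indexing_failed = False
except TypeError as error:
    indexing_failed = True
    print('TypeError:', error)

# Converting to a list gives something that can be indexed.
as_list = list(wordset)
print(as_list[0] in wordset)  # True
```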

But all is not lost! Sets are not the same as lists, so we handle sets differently. To get a random item from a set, we use the random.sample() function, which returns a list of one or more items chosen without repeats. (You can choose how many returned items you want.) Older Python 3 releases accepted the set directly; since Python 3.11, sample() requires a sequence, so we pass list(wordset).

print(random.sample(list(wordset), 1))

In random.sample(list(wordset), 1), list(wordset) holds the unique words, and the ‘1’ says, “Get only one item.” We could use 2 to get two random items, 3 for three random items, and so on.
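Here is a short sketch of asking for different numbers of items. The names `pool`, `one`, and `three` are chosen for this example; the words are the unique ones from the list above.

```python
import random

wordset = {'graveyard', 'floccinaucinilihilipilification', 'omphalos',
           'apple', 'toy', 'box', 'visible', 'cemetery', 'pasquinade',
           'phatic', 'complaisant', 'aeolian', 'coin', 'ort'}

pool = list(wordset)            # sample() needs a sequence on Python 3.11+

one = random.sample(pool, 1)    # a list holding one word
three = random.sample(pool, 3)  # a list holding three different words

print(one)
print(three)
```

Asking for more items than the pool contains raises ValueError, since sample() never returns the same position twice.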

The idea is that since each item appears only once in the set, each word has an even chance of being chosen.

There is more than one way to obtain the same result. For example, we could skip the separate set assignment:

print(random.sample(list(set(wordlist)), 1))

Hopefully, this quick code snippet stimulates your imagination so you can think of your own improvements should the need arise.
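One possible variation, offered as a sketch rather than as the method used above, is random.choice(). It returns the item itself instead of a one-element list, and it also needs a sequence, so the set is converted to a list first. The short `wordlist` here is a made-up example.

```python
import random

# Hypothetical list with repeated entries.
wordlist = ['apple', 'toy', 'apple', 'coin', 'ort', 'ort']

# choice() picks one element uniformly from a sequence.
word = random.choice(list(set(wordlist)))
print(word)
```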

Have fun!

