Programming Challenge: Random Text Files

📅 July 14, 2015
Captain Crunchneck needs your help! Can you write a Python program that generates a random number of text files containing random text?

Instructions: random_text_files (PDF)

Try to solve it for yourself before peeking at the hints!


Here is one possible solution, commented for clarity, that shows how this can be achieved.


#!/usr/bin/env python3

# Generates a random number of text files consisting of random nonsense text.
# Copies the real file to blend in among them.

import os
import shutil
import random
import re

# Get a filename from the wordlist
# ------------------------------------------------------
def getFilename():
    name1 = random.choice(wordlist).lower()
    name1 = re.sub('[^a-zA-Z0-9]', '', name1)

    name2 = random.choice(wordlist).lower()
    name2 = re.sub('[^a-zA-Z0-9]', '', name2)

    return name1 + '_' + name2 + '.txt'
# ------------------------------------------------------

# The fake directory.
fakeDir = 'cookierecipes'

# The real file. Using a cookie recipe theme.
realFile = 'cookie_recipe.txt'

# Using an absolute path to the Linux dictionary.
# Might need to be changed for other systems.
wordlistFile = '/usr/share/dict/words'

# Get number of words (one per line) in wordlist.
# Ignores blank lines and uses a list.
wordlist = []

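try:
    with open(wordlistFile) as f:
        for line in f:
            line = line.strip()
            if not line == '':
                wordlist.append(line)
except OSError:
    print('Problem with ' + wordlistFile)
    raise SystemExit(1) # Cannot continue without a wordlist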

numWords = len(wordlist) # How many words in wordlist?

# Start fresh. Remove existing fakeDir if it already exists.
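if os.path.isdir(fakeDir):
    shutil.rmtree(fakeDir)

# Create the empty directory that will hold the generated files.
os.mkdir(fakeDir)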

# How many files to generate?
numFiles = random.randrange(5, 26)
punctList = ['.', '!', '?', '.'] # What punctuation is available? ('.' twice as likely)

# Generate text by paragraphs, sentences, and words
for n in range(1, numFiles + 1):

    # Get unique filename. Keep trying until a unique name is found.
    while True:
        filename = getFilename()
        fpath = fakeDir + os.path.sep + filename
        if not os.path.exists(fpath):
            break

    # Open file for writing
    try:
        f = open(fpath, 'w')
    except OSError:
        print('Problem with ' + fpath)
        continue # Skip this file and move on to the next one


    # Number of paragraphs
    for p in range(1, random.randrange(4, 14)):

        # Generate random number of sentences for the paragraph
        for s in range(1, random.randrange(3, 10)):

            # Get random words from wordlist
            sentenceLen = random.randrange(3, 101) # Words in this sentence
            sentence = ''
            for word in range(sentenceLen):
                sentence = sentence + random.choice(wordlist)
                if not word == sentenceLen - 1:
                    sentence += ' '
                else:
                    sentence += random.choice(punctList) # Random punctuation

            f.write(sentence.capitalize() + '\n\n') # Write this sentence to the file

    f.close() # Done with this file

# Copy the real file to the destination directory
try:
    shutil.copy2(realFile, fakeDir + os.path.sep + realFile)
except OSError:
    print('Problem with ' + realFile)

Can this code be improved? Absolutely. It was written to illustrate the logic behind each step while still generating a satisfactory result. One area to tweak: nothing checks for files that exceed the maximum size. The program also runs slowly, so it needs optimization.
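One way to add the missing size check is to collect the sentences for a file in memory and stop as soon as the next one would push the file past a byte budget. The sketch below is only an illustration: cap_text and MAX_BYTES are hypothetical names, and the value 2000 is a placeholder rather than the limit from the challenge instructions.

```python
# Hypothetical cap; use the maximum size given in the challenge instructions.
MAX_BYTES = 2000

def cap_text(sentences, max_bytes=MAX_BYTES):
    """Join sentences (each followed by a blank line) without exceeding max_bytes."""
    parts = []
    size = 0
    for sentence in sentences:
        chunk = sentence + '\n\n'
        chunk_size = len(chunk.encode('utf-8'))  # Byte length if written as UTF-8
        if size + chunk_size > max_bytes:
            break  # Adding this sentence would exceed the cap
        parts.append(chunk)
        size += chunk_size
    return ''.join(parts)  # Join once instead of concatenating repeatedly

# The second sentence would exceed 30 bytes, so only the first is kept.
text = cap_text(['Sugar flour butter.', 'Oven cookie dough!'], max_bytes=30)
```

Collecting the pieces in a list and joining them once is a common way to avoid repeated string concatenation, which may also help a little with speed.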

realFile is a real text file containing a real message surrounded by bogus text, which makes the file large enough to blend in with the others.


In this example, the program creates a subdirectory named cookierecipes. A random number of files should appear.

A list of random text files generated.

Uh-oh! Notice that four of the files are 0 bytes? That’s -4 points from the total score. The code needs refinement to prevent this from happening.
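A quick way to catch this before handing in the result is to scan the output directory for empty files after the program finishes. This is a minimal sketch; find_empty_files is a hypothetical helper name, and it only reports the problem rather than explaining what caused it.

```python
import os

def find_empty_files(directory):
    """Return the sorted names of regular files in directory that are 0 bytes."""
    empty = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            empty.append(name)
    return empty

# Only scan if the directory from the example exists.
if os.path.isdir('cookierecipes'):
    print(find_empty_files('cookierecipes'))
```

An empty list means every generated file has at least some content.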

Example Text File Contents

Each random file should have nonsense text that looks like this:

One of many text files containing text chosen at random from a wordlist or dictionary file.

Each word is separated by a space. In addition, each sentence is capitalized and ends with a random punctuation character, so this earns the bonus points.
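The sentence format described above can be sketched as a small function. This is one possible approach, not the only one: make_sentence and PUNCTUATION are hypothetical names, and the short word list stands in for a real dictionary file.

```python
import random

PUNCTUATION = ['.', '!', '?', '.']  # '.' is listed twice so it is twice as likely

def make_sentence(words, count, rng=random):
    """Return count random words joined by spaces, capitalized, with end punctuation."""
    chosen = [rng.choice(words).lower() for _ in range(count)]
    sentence = ' '.join(chosen)  # Single space between words, none at the end
    return sentence.capitalize() + rng.choice(PUNCTUATION)

# The output differs on every run because the words are chosen at random.
print(make_sentence(['apple', 'banana', 'cherry'], 5))
```

Joining the words with ' '.join means there is no need to special-case the last word to avoid a trailing space.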

With these hints in mind, try to improve your code to make the process faster and error-free. Have fun!

