Instructions: random_text_files (PDF)
Try to solve it for yourself before peeking at the hints!
Here is one possible solution, commented for clarity, that shows how this can be achieved.
#!/usr/bin/env python3
# Generates a random number of text files consisting of random nonsense text.
# Copies the real file to blend in among them.

import os
import shutil
import random
import re

# Get a filename from the wordlist
# ------------------------------------------------------
def getFilename():
    name1 = random.sample(wordlist, 1)[0].lower()
    name1 = re.sub('[^a-zA-Z0-9]', '', name1)
    name2 = random.sample(wordlist, 1)[0].lower()
    name2 = re.sub('[^a-zA-Z0-9]', '', name2)
    return name1 + '_' + name2 + '.txt'
# ------------------------------------------------------

# The fake directory.
fakeDir = 'cookierecipes'

# The real file. Using a cookie recipe theme.
realFile = 'cookie_recipe.txt'

# Using an absolute path to the Linux dictionary.
# Might need to be changed for other systems.
wordlistFile = '/usr/share/dict/words'

# Get number of words (one per line) in wordlist.
# Ignores blank lines and uses a list.
wordlist = []
try:
    with open(wordlistFile) as f:
        for line in f:
            line = line.strip()
            if not line == '':
                wordlist.append(line)
except:
    print('Problem with ' + wordlistFile)
    quit(1)

numWords = len(wordlist)    # How many words in wordlist?

# Start fresh. Remove existing fakeDir if it already exists.
if os.path.isdir(fakeDir):
    shutil.rmtree(fakeDir)
os.makedirs(fakeDir)

# How many files to generate?
numFiles = random.randrange(5, 26)

punctList = ['.', '!', '?', '.']    # What punctuation is available? ('.' twice as likely)

# Generate text by paragraphs, sentences, and words
for n in range(1, numFiles + 1):
    # Get unique filename. Keep trying until a unique name is found.
    while True:
        filename = getFilename()
        fpath = fakeDir + os.path.sep + filename
        if not os.path.exists(fpath):
            break

    # Open file for writing
    try:
        f = open(fpath, 'w')
    except:
        print('Problem with ' + fpath)
        quit(1)

    print(fpath)

    # Number of paragraphs
    for p in range(1, random.randrange(4, 14)):
        # Generate random number of sentences for the file
        for s in range(1, random.randrange(3, 10)):
            # Get random words from wordlist
            numWords = random.randrange(1, 101)
            random.shuffle(wordlist)
            sentence = ''
            for word in range(3, numWords):
                sentence = sentence + wordlist[word]
                if not word == numWords - 1:
                    sentence += ' '
                else:
                    punct = random.sample(punctList, 1)[0]
                    sentence += punct    # Random punctuation
            f.write(sentence.capitalize() + '\n\n')    # Write this sentence to the file
    f.close()

# Copy the real file to the destination directory
try:
    shutil.copy2(realFile, fakeDir + os.path.sep + realFile)
except:
    print('Problem with ' + realFile)
Can this code be improved? Absolutely. It was written to illustrate the logic behind the steps while still generating a satisfactory result. One gap is that nothing checks for large files that exceed the maximum size. Also, the program runs slowly, so it desperately needs optimization.
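As one possible direction for both points, here is a minimal sketch. In the listing above, random.shuffle(wordlist) reorders the entire dictionary once per sentence, which is a likely source of the slowness; random.sample can draw just the words needed instead. The names make_sentence, make_text, and PUNCTUATION, and the sentences and max_bytes parameters, are hypothetical and not part of the original program. The actual size limit comes from the exercise instructions and is not assumed here. The word list must not be empty.

```python
import random

PUNCTUATION = ['.', '!', '?', '.']  # '.' appears twice, so it is twice as likely

def make_sentence(words, rng=random):
    """Return one capitalized sentence of 1 to 100 distinct words from `words`."""
    count = rng.randint(1, min(100, len(words)))
    picked = rng.sample(words, count)  # selects `count` words; the list itself is not reordered
    return ' '.join(picked).capitalize() + rng.choice(PUNCTUATION)

def make_text(words, sentences, max_bytes, rng=random):
    """Return up to `sentences` sentences, stopping before `max_bytes` is exceeded."""
    parts = []
    size = 0
    for _ in range(sentences):
        chunk = make_sentence(words, rng) + '\n\n'
        chunk_size = len(chunk.encode('utf-8'))
        if size + chunk_size > max_bytes:
            break  # adding this sentence would push the file over the cap
        parts.append(chunk)
        size += chunk_size
    return ''.join(parts)
```

Note that if max_bytes is smaller than the first sentence, make_text returns an empty string, so a caller would still want to guard against writing an empty file.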
realFile is a real text file containing a real message surrounded with bogus text to make the file size large enough to blend in with the others.
In this example, the program creates a subdirectory named cookierecipes. A random number of files should appear.
Uh-oh! Notice that four files are 0 bytes? That’s -4 points from the total score. The code needs refinement to prevent this from happening.
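One way to catch this, whatever the cause turns out to be, is to scan the output directory after generation and report any empty files. The sketch below is only an illustration; find_empty_files is a hypothetical helper name, not something from the original program.

```python
import os

def find_empty_files(directory):
    """Return the sorted names of regular files in `directory` that are 0 bytes."""
    empty = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.stat().st_size == 0:
                empty.append(entry.name)
    return sorted(empty)
```

Calling find_empty_files('cookierecipes') at the end of the script would list the files that need to be regenerated or removed before the result is scored.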
Example Text File Contents
Each random file should have nonsense text that looks like this:
Each word is separated by a space. In addition, each sentence is capitalized and ends with a random punctuation character, so this earns the bonus points.
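To check your own output against these three properties, a small validator along these lines may help. It is a sketch under simple assumptions: is_well_formed is a hypothetical name, and the accepted punctuation is limited to the three characters used in the hint code.

```python
def is_well_formed(sentence):
    """Check capitalization, single-space separation, and ending punctuation."""
    if not sentence:
        return False
    starts_ok = sentence[0] == sentence[0].upper()        # first character is not lowercase
    ends_ok = sentence[-1] in '.!?'                        # ends with allowed punctuation
    spacing_ok = sentence == ' '.join(sentence.split())    # exactly one space between words
    return starts_ok and ends_ok and spacing_ok
```

Running it over each non-blank line of a generated file gives a quick pass/fail signal before submitting.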
With these hints in mind, try to improve your code to make the process faster and error-free. Have fun!