How can I print the number of unique words in this doc.txt file / why doesn't it work?

Time:04-02

  • Problem: I need to output the number of unique words in this doc.txt file, which should be 16 (no duplicates), and I can't work out why my code doesn't do so.

  • Note: The black lines just cover my user name, which includes a space (e.g. Will Smith). I don't expect full help on this project; I just can't find a solution to this issue, and this is a last resort. Apologies.

Code Image: (not included)

What I've tried:

  • I've tried changing the file path, the function parameter name, and the way the function is called, and putting "/" where the spaces are.

CodePudding user response:

A set would make things easier. For example:

with open('file.txt') as infile:
  contents = infile.read()
  myset = set(contents.split())
  print(len(myset)) # the number of unique words
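
Note that the snippet above treats "Word", "word" and "word," as three different words. If the expected count of 16 ignores case and punctuation, a minimal sketch along these lines might help. The function name count_unique_words and the normalization rules are assumptions for illustration, not something from the question:

```python
import string

def count_unique_words(text):
    """Return the number of distinct words in text, ignoring case and punctuation."""
    # Assumption: words are separated by whitespace, and neither case
    # nor ASCII punctuation should make two words count as different.
    cleaned = text.lower().translate(str.maketrans('', '', string.punctuation))
    return len(set(cleaned.split()))

sample = "The cat sat. The dog sat, too!"
print(count_unique_words(sample))  # prints 5
```

To use it on a file, read the contents as before (infile.read()) and pass the resulting string to the function.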

CodePudding user response:

import string

def dictionary_1(path):
    # read the whole file; the with-block closes it automatically
    with open(path, "r") as text_file:
        read_file = text_file.read()

    # remove punctuation marks
    read_file = read_file.translate(str.maketrans('', '', string.punctuation))

    # duplicate words collapse into a single key, so the keys are the unique words
    dictionary = {word: [] for word in read_file.split()}

    print("Words in dictionary: ", len(dictionary))
    print(dictionary)
    return dictionary

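Also, avoid using the names of Python built-ins (dict, list, ...) as variable names. They are not reserved keywords, so Python allows it, but the variable then shadows the built-in for the rest of that scope.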
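
As a small illustration of why reusing a built-in name causes trouble (the function name shadowing_demo is made up for this example):

```python
def shadowing_demo():
    """Show what happens when a variable reuses the name of the built-in list."""
    list = [1, 2, 3]  # shadows the built-in list type inside this function
    try:
        list("abc")  # this now calls the list object above, not the built-in
    except TypeError as exc:
        return str(exc)
    return "no error"

print(shadowing_demo())  # prints: 'list' object is not callable
```

Picking a different name such as words or word_list avoids the problem.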
