Home > Net >  trying to generate a vector
trying to generate a vector

Time:02-05

I'm trying to generate vector from wikipediaText. But when i running my def generateVector i have an error message like: list indices must be integers or slices, not str on line word2idx[word] = idx. I'll be very thankful if somebody help me with my task. Here my code:

def getVocab(inputString):
    inputString = inputString.lower()
    inputString = inputString.replace("."," ")
    parsed = inputString.split()
    vocab = set(parsed)
    return vocab, parsed 

wikipediaText = 'Python is an interpreted, high-level, general-purpose programming language.   Created by Guido van Rossum and first released in 1991, Python`s design philosophy emphasizes code  readability with its notable use of significant whitespace.'

vocab, _= getVocab(wikipediaText)
print(vocab)


def generateVector(inputString):
    vocab, parsed = getVocab(inputString)
    word2idx = []
    for idx, word in enumerate(vocab):
        word2idx[word] = idx
    
    vector = []
    for word in len(range(parsed)):
        vector.append(word2idx[word])

    return vector

print(generateVector(wikipediaText))

CodePudding user response:

You define word2idx as a list (word2idx = []) and then trying to address it as a dictionary. Than you have another mistake in the next for cycle. The corrected code below works, but I'm not sure if the result is what you expected

def getVocab(inputString):
    inputString = inputString.lower()
    inputString = inputString.replace("."," ")
    parsed = inputString.split()
    vocab = set(parsed)
    return vocab, parsed 

wikipediaText = 'Python is an interpreted, high-level, general-purpose programming language.   Created by Guido van Rossum and first released in 1991, Python`s design philosophy emphasizes code  readability with its notable use of significant whitespace.'

vocab, _= getVocab(wikipediaText)
print(vocab)


def generateVector(inputString):
    vocab, parsed = getVocab(inputString)
    word2idx = {}
    for idx, word in enumerate(vocab):
        word2idx[word] = idx
    
    vector = []
    for word in parsed:
        vector.append(word2idx[word])

    return vector

print(generateVector(wikipediaText))

CodePudding user response:

I think you are trying to create a dictionary, but you are trying to use a list instead. Try this:

def generateVector(inputString):
    vocab, parsed = getVocab(inputString)
    word2idx = {}
    for idx, word in enumerate(vocab):
        word2idx[word] = idx

    vector = []
    for word in parsed:
        vector.append(word2idx[word])

    return vector
  • Related