Error while learning doc2vec: AttributeError: 'str' object has no attribute 'words'

Time:10-23

I'm a beginner. I've been studying this for two days and really can't figure out how to solve it, so I'm asking the experts here for help!

I am using Python 2.7.10 with gensim 0.12.2 and am currently learning the doc2vec model.
I got stuck right at the start of the official tutorial (http://radimrehurek.com/gensim/models/doc2vec.html). The tutorial says:

Make sure you have a C compiler before installing gensim, to use optimized (compiled) doc2vec training (70x speedup [blog]).

Initialize a model with e.g.:

>>> model = Doc2Vec(documents, size=100, window=8, min_count=5, workers=4)
Persist a model to disk with:

>>> model.save(fname)
>>> model = Doc2Vec.load(fname)  # you can continue training with the loaded model!
The model can also be instantiated from an existing file on disk in the word2vec C format:

>>> model = Doc2Vec.load_word2vec_format('/tmp/vectors.txt', binary=False)  # C text format
>>> model = Doc2Vec.load_word2vec_format('/tmp/vectors.bin', binary=True)  # C binary format


My code is below. The file negForPy.txt looks roughly like this:
 
Poor standard room is not as good as 3 stars and facilities are very old. Suggest hotel standard room from the new improvement of the old.
Service attitude is very poor, the receptionist doesn't seem to be trained, don't even understand basic manners, unexpectedly a few guest at the same time; Assistant manager even worse, with the guest argued endlessly, to the general manager's telephone complaints should all dare not to, without anything out, and this need not be afraid, so
Geographic position is good, where are more convenient, but the service doesn't like Howard Johnson group management, poor, and sleep in the afternoon to wash a bath, wanted to let the hotel to cleaning, so opened, please clean the service lamp, but back to the hotel for the night, found a clean service lamp is turned off, and the room is not cleaned,
1, I live on the standard of the road, room facilities, humble, and the room window outdoor and a layer of glass curtain wall, and cannot be opened, in the room can't natural ventilation, daylighting is bad, 2, ate three meal, little varieties and 3, the restaurant on the second floor is rented out, expensive price, the original order of the tenant can offer a ninety percent discount (in the room service guide also clearly written, but wait to checkout of seafood and wine is not to be discounted, and there is no invoice, it is not easy to find good to manager to get the invoice, the next day in aggregate with four-star gap is too big!
1, I live on the standard of the road, room facilities, humble, and the room window outdoor and a layer of glass curtain wall, and cannot be opened, in the room can't natural ventilation, daylighting is bad, 2, ate three meal, little varieties and 3, the restaurant on the second floor is rented out, expensive price, the original order of the tenant can offer a ninety percent discount (in the room service guide also clearly written, but wait to checkout of seafood and wine is not to be discounted, and there is no invoice, it is not easy to find good to manager to get the invoice, the next day in aggregate with four-star gap is too big!


 
# encoding: utf-8
from gensim.models.doc2vec import Doc2Vec

with open('negForPy.txt', 'r') as infile:
    documents = infile.readlines()
model = Doc2Vec(documents, size=100, window=8, min_count=5, workers=4)

model = Doc2Vec.load_word2vec_format('vectors1.txt', binary=False)


Line 8 and the code before it are two different approaches. I ran them separately, and each one raises an error. For example, running the first approach directly gives:
Traceback (most recent call last):
  File "E:\zzWorkFiles\ZZworkspace\Practise1\src\Prac1\draft2.py", line 6, in <module>
    model = Doc2Vec(["aaaa", "andajfiaihe", "dfghiah", "adoifjeng"], size=100, window=8, min_count=5, workers=4)
  File "D:\programFiles\Python2.7.10\lib\site-packages\gensim-0.12.2-py2.7-win32.egg\gensim\models\doc2vec.py", line 584, in __init__
    self.build_vocab(documents, trim_rule=trim_rule)
  File "D:\programFiles\Python2.7.10\lib\site-packages\gensim-0.12.2-py2.7-win32.egg\gensim\models\word2vec.py", line 495, in build_vocab
    self.scan_vocab(sentences, trim_rule=trim_rule)  # initial survey
  File "D:\programFiles\Python2.7.10\lib\site-packages\gensim-0.12.2-py2.7-win32.egg\gensim\models\doc2vec.py", line 627, in scan_vocab
    document_length = len(document.words)
AttributeError: 'str' object has no attribute 'words'


How should I handle this?
I would be very grateful for an answer from the experts!

CodePudding user response:

Use gensim.models.doc2vec.TaggedLineDocument(file) to process the input data, then pass the result to the model.
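To make this suggestion concrete: the failing line in the traceback is `len(document.words)`, so Doc2Vec reads a `.words` attribute from every item it is given, and a plain string has none. The sketch below builds by hand the kind of objects that `TaggedLineDocument` yields, one per line, with the line number as the tag. It assumes each line is already split into tokens by whitespace (Chinese text would need a word segmenter first). `iter_tagged_lines` and the sample file contents are hypothetical, and the namedtuple fallback is only there so the sketch runs without gensim installed.

```python
import os
import tempfile
from collections import namedtuple

try:
    from gensim.models.doc2vec import TaggedDocument
except ImportError:
    # Stand-in with the same two fields, so the sketch runs without gensim.
    TaggedDocument = namedtuple('TaggedDocument', 'words tags')


def iter_tagged_lines(path):
    """Hypothetical helper: yield one tagged document per line of a text file."""
    with open(path, 'r') as infile:
        for line_no, line in enumerate(infile):
            # words is a list of tokens, tags is a list of labels for the document
            yield TaggedDocument(words=line.split(), tags=[line_no])


# Tiny stand-in for negForPy.txt (assumed: one whitespace-tokenized review per line).
tmp = tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False)
tmp.write("room facilities are old\nservice attitude is poor\n")
tmp.close()

documents = list(iter_tagged_lines(tmp.name))
os.remove(tmp.name)

for doc in documents:
    print(doc.words, doc.tags)
```

With gensim installed, `TaggedLineDocument('negForPy.txt')` yields the same kind of objects directly from a filename, and it is that result, not the raw list of lines, that goes into `Doc2Vec(...)`.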

CodePudding user response:

Quoting lala_001's reply on the 1st floor:
Use gensim.models.doc2vec.TaggedLineDocument(file) to process the input data, then pass the result to the model.


Hello!
Based on my understanding of your suggestion, I changed the code to this:
 
import gensim
from gensim.models.doc2vec import Doc2Vec

with open('negForPy.txt', 'r') as infile:
    documents = infile.readlines()
gensim.models.doc2vec.TaggedLineDocument(documents)
model = Doc2Vec(documents, size=100, window=8, min_count=5, workers=4)


But I still get an error:
Traceback (most recent call last):
  File "E:\zzWorkFiles\ZZworkspace\Practise1\src\Prac1\draft2.py", line 31, in <module>
    model = Doc2Vec(documents, size=100, window=8, min_count=5, workers=4)
  File "D:\programFiles\Python2.7.10\lib\site-packages\gensim-0.12.2-py2.7-win32.egg\gensim\models\doc2vec.py", line 584, in __init__
    self.build_vocab(documents, trim_rule=trim_rule)
  File "D:\programFiles\Python2.7.10\lib\site-packages\gensim-0.12.2-py2.7-win32.egg\gensim\models\word2vec.py", line 495, in build_vocab
    self.scan_vocab(sentences, trim_rule=trim_rule)  # initial survey
  File "D:\programFiles\Python2.7.10\lib\site-packages\gensim-0.12.2-py2.7-win32.egg\gensim\models\doc2vec.py", line 627, in scan_vocab
    document_length = len(document.words)
AttributeError: 'str' object has no attribute 'words'


Could you explain in more detail?
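One likely reason the revised code still fails, going only by the code shown: the result of `TaggedLineDocument(documents)` is never assigned, so the `documents` passed to `Doc2Vec` is still the plain list of strings returned by `readlines()`. The pure-Python sketch below reproduces the failing line from the traceback (`len(document.words)`) without gensim. `Doc` and `scan_lengths` are hypothetical stand-ins for illustration, not gensim functions.

```python
from collections import namedtuple

# Stand-in with the two fields Doc2Vec reads from each document.
Doc = namedtuple('Doc', 'words tags')


def scan_lengths(documents):
    """Hypothetical stand-in for the failing loop: reads .words from every item."""
    return [len(document.words) for document in documents]


lines = ["room is old\n", "service is poor\n"]  # what readlines() returns

# Wrapping without keeping the result changes nothing: `lines` is still a list of str.
[Doc(line.split(), [i]) for i, line in enumerate(lines)]
try:
    scan_lengths(lines)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# Keeping the wrapped result and passing *that* on works.
tagged = [Doc(line.split(), [i]) for i, line in enumerate(lines)]
lengths = scan_lengths(tagged)
print(lengths)
```

Separately, `TaggedLineDocument` takes a filename or an open file object rather than a list of lines, so the call would look like `documents = gensim.models.doc2vec.TaggedLineDocument('negForPy.txt')`, with that `documents` then passed to `Doc2Vec`.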

CodePudding user response:

OP, did you solve this? Please share how you solved it.

CodePudding user response:

Not solved...
I read elsewhere that this can be a version mismatch between Python and the various libraries, and that downgrading can fix it, but I never followed up on that...

CodePudding user response:

How did you solve this problem? I've run into a similar one.
