I am using Python 2.7.10 with gensim 0.12.2 and am currently learning the doc2vec model.
I am following the official tutorial (http://radimrehurek.com/gensim/models/doc2vec.html) but cannot get past the very first steps. The tutorial says:
Make sure you have a C compiler before installing gensim, to use optimized (compiled) doc2vec training (70x speedup [blog]).
Initialize a model with e.g.:
> model = Doc2Vec(documents, size=100, window=8, min_count=5, workers=4)
Persist a model to disk with:
> model.save(fname)
> model = Doc2Vec.load(fname)  # you can continue training with the loaded model!
The model can also be instantiated from an existing file on disk in the word2vec C format:
> model = Doc2Vec.load_word2vec_format('/tmp/vectors.txt', binary=False)  # C text format
> model = Doc2Vec.load_word2vec_format('/tmp/vectors.bin', binary=True)  # C binary format
My code is below. The input file negForPy.txt contains lines roughly this long:
Poor standard room is not as good as 3 stars and facilities are very old. Suggest hotel standard room from the new improvement of the old.
Service attitude is very poor, the receptionist doesn't seem to be trained, don't even understand basic manners, unexpectedly a few guest at the same time; Assistant manager even worse, with the guest argued endlessly, to the general manager's telephone complaints should all dare not to, without anything out, and this need not be afraid, so
Geographic position is good, where are more convenient, but the service doesn't like Howard Johnson group management, poor, and sleep in the afternoon to wash a bath, wanted to let the hotel to cleaning, so opened, please clean the service lamp, but back to the hotel for the night, found a clean service lamp is turned off, and the room is not cleaned,
1, I live on the standard of the road, room facilities, humble, and the room window outdoor and a layer of glass curtain wall, and cannot be opened, in the room can't natural ventilation, daylighting is bad, 2, ate three meal, little varieties and 3, the restaurant on the second floor is rented out, expensive price, the original order of the tenant can offer a ninety percent discount (in the room service guide also clearly written, but wait to checkout of seafood and wine is not to be discounted, and there is no invoice, it is not easy to find good to manager to get the invoice, the next day in aggregate with four-star gap is too big!
# encoding: utf-8
from gensim.models.doc2vec import Doc2Vec
with open('negForPy.txt', 'r') as infile:
    documents = infile.readlines()
model = Doc2Vec(documents, size=100, window=8, min_count=5, workers=4)
model = Doc2Vec.load_word2vec_format('vectors1.txt', binary=False)
Line 8 and the line before it are two different approaches. I ran each one separately, and either way I get an error. For example, running the code above directly gives:
Traceback (most recent call last):
  File "E:\zzWorkFiles\ZZworkspace\Practise1\src\Prac1\draft2.py", line 6, in <module>
    model=Doc2Vec(["aaaa", "andajfiaihe", "dfghiah", "adoifjeng"], size=100, window=8, min_count=5, workers=4)
  File "D:\programFiles\Python2.7.10\lib\site-packages\gensim-0.12.2-py2.7-win32.egg\gensim\models\doc2vec.py", line 584, in __init__
    self.build_vocab(documents, trim_rule=trim_rule)
  File "D:\programFiles\Python2.7.10\lib\site-packages\gensim-0.12.2-py2.7-win32.egg\gensim\models\word2vec.py", line 495, in build_vocab
    self.scan_vocab(sentences, trim_rule=trim_rule)  # initial survey
  File "D:\programFiles\Python2.7.10\lib\site-packages\gensim-0.12.2-py2.7-win32.egg\gensim\models\doc2vec.py", line 627, in scan_vocab
    document_length = len(document.words)
AttributeError: 'str' object has no attribute 'words'
How should I handle this?
Any help from the experts would be greatly appreciated!
CodePudding user response:
Use gensim.models.doc2vec.TaggedLineDocument(file) to process the input data, and then pass the result to the model.
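To expand on why this helps: the traceback ends at `len(document.words)`, so Doc2Vec expects each document to be an object with a `words` attribute (and a `tags` attribute), not a plain string from `readlines()`. Below is a minimal sketch of building such objects by hand, roughly what TaggedLineDocument does for a file with one whitespace-tokenized document per line. The helper name `read_tagged_lines` is made up for illustration, and the namedtuple fallback is only a stand-in so the snippet runs without gensim installed.

```python
# -*- coding: utf-8 -*-
import io
from collections import namedtuple

try:
    # The real class shipped with gensim (0.12.x and later).
    from gensim.models.doc2vec import TaggedDocument
except ImportError:
    # Stand-in with the same two fields, used only if gensim is missing.
    TaggedDocument = namedtuple('TaggedDocument', 'words tags')


def read_tagged_lines(path):
    """Hypothetical helper: yield one TaggedDocument per non-empty line.

    Assumes the words on each line are already separated by whitespace.
    Each document is tagged with its line number.
    """
    with io.open(path, encoding='utf-8') as infile:
        for line_no, line in enumerate(infile):
            words = line.split()
            if words:
                yield TaggedDocument(words=words, tags=[line_no])


# Usage with the file and parameters from the question (not executed here):
#
#   from gensim.models.doc2vec import Doc2Vec
#   documents = list(read_tagged_lines('negForPy.txt'))
#   model = Doc2Vec(documents, size=100, window=8, min_count=5, workers=4)
```

The `list(...)` call matters because the corpus is read more than once (vocabulary scan, then training), and a generator would be exhausted after the first pass. Note that `size` is the parameter name in gensim 0.12.2; later releases renamed it to `vector_size`.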