TypeError: a bytes-like object is required, not 'str'

Time:09-18

I followed someone else's sentiment-analysis program from the web exactly, but it still fails with this error: ...... line 101, in words
    degreeDict[d.split(' ')[0]] = d.split(' ')[1]
TypeError: a bytes-like object is required, not 'str'
Why is encoding such a hassle in Python 3? Could someone tell me how to fix this?
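A minimal sketch of what appears to be happening, offered as an illustration rather than a definitive fix: a file opened with 'rb' yields bytes lines, and bytes.split() only accepts a bytes separator, so split(' ') raises this TypeError. The temporary file, its sample contents and the UTF-8 encoding below are assumptions made for the example; the commented-out GBK conversion in the posted code suggests the real lexicon files may be GBK, in which case that encoding would go in place of "utf-8".

```python
import os
import tempfile

# Hypothetical stand-in for degreeDict.txt: one "word weight" pair per line.
sample = "very 1.5\nslightly 0.5\n\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "degreeDict.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # Binary mode yields bytes, so a str separator is rejected by split().
    with open(path, "rb") as f:
        raw_lines = [x.strip() for x in f.readlines() if x.strip() != b""]
    try:
        raw_lines[0].split(" ")
        error_message = None
    except TypeError as exc:
        error_message = str(exc)

    # Option 1: keep binary mode and decode each line before splitting.
    degree_from_bytes = {}
    for d in raw_lines:
        word, weight = d.decode("utf-8").split(" ")
        degree_from_bytes[word] = weight

    # Option 2: open in text mode with an explicit encoding.
    degree_from_text = {}
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                word, weight = line.split(" ")
                degree_from_text[word] = weight

print(error_message)
print(degree_from_bytes)
print(degree_from_text)
```

Either option leaves the dictionary keyed by str, which is what the later `word in degreeDict.keys()` checks would need, since jieba returns str tokens in Python 3.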
 
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from collections import defaultdict
import os
import re
import jieba
import codecs
import sys
import chardet
import matplotlib.pyplot as plt
import importlib
importlib.reload(sys)

# segment a sentence with jieba and drop the stop words

def sent2word(sentence):

    # call jieba to segment the sentence
    segList = jieba.cut(sentence)

    # store the segmentation result in segResult as a list
    segResult = []
    for w in segList:
        segResult.append(w)

    # call readLines to read the stop words
    stopwords = readLines('E://stop_words thesaurus.txt')

    # keep a word in newSent only if it is not a stop word
    newSent = []
    for word in segResult:
        if word + '\n' in stopwords:
            continue
        else:
            newSent.append(word)
    # return newSent
    return newSent


# segment a sentence and return the result without removing stop words (some steps need the raw tokens)
def returnsegResult(sentence):

    segResult = []
    segList = jieba.cut(sentence)

    for w in segList:
        segResult.append(w)
    return segResult


# collect the paths of all files in the directory filepath and return them
def eachFile(filepath):
    pathDir = os.listdir(filepath)
    child = []
    for allDir in pathDir:
        child.append(os.path.join('%s/%s' % (filepath, allDir)))
    return child

# read each line of the file at filename and return the data (the GBK conversion is commented out)
def readLines(filename):
    fopen = open(filename, 'rb', encoding="utf-8")


    data = []
    for x in fopen.readlines():
        if x.strip() != b'':
            data.append(x.strip())  # data.append(unicode(x.strip(), "GBK"))

    fopen.close()
    return data


# read each line of the file at filename and return the data
def readLines2(filename):
    fopen = open(filename, 'rb')  # FILE_OBJECT = open('order.log', 'r', encoding="utf-8")
    data = []
    for x in fopen.readlines():
        if x.strip() != '':
            data.append(x.strip())  # x.strip()

    fopen.close()
    return data

# Mainly for sentiment lookup (see the related code in the program files).
# This part was pulled out for speed; it really belongs in classifyWords,
# and loading it here probably has a small effect on efficiency.
def words():
    # sentiment words
    senList = readLines2('E://BosonNLP_sentiment_score BosonNLP_sentiment_score thesaurus.txt')
    senDict = defaultdict()


    # for s in senList:
    #     senDict[s.split()[0]] = s.split(' ')[1]
    # negation words
    notList = readLines2('E://notDict thesaurus.txt')
    # degree adverbs
    degreeList = readLines2("E:/sentiment/degreeDict.txt")
    degreeDict = defaultdict()

    for d in degreeList:

        degreeDict[d.split(' ')[0]] = d.split(' ')[1]

    return senDict, notList, degreeDict

# (1) sentiment words

# see the text document: sort each word by its sentiment role so the sentence can be scored
def classifyWords(wordDict, senDict, notList, degreeDict):

    senWord = defaultdict()
    notWord = defaultdict()
    degreeWord = defaultdict()
    for word in wordDict.keys():
        if word in senDict.keys() and word not in notList and word not in degreeDict.keys():
            senWord[wordDict[word]] = senDict[word]
        elif word in notList and word not in degreeDict.keys():
            notWord[wordDict[word]] = -1
        elif word in degreeDict.keys():
            degreeWord[wordDict[word]] = degreeDict[word]
    return senWord, notWord, degreeWord


# compute the sentence score (see the program document)
def scoreSent(senWord, notWord, degreeWord, segResult):
    W = 1
    score = 0
    # lists of the positions of all sentiment, negation and degree words
    senLoc = senWord.keys()
    notLoc = notWord.keys()
    degreeLoc = degreeWord.keys()
    senloc = -1
    # notloc = -1
    # degreeloc = -1
    # iterate over all words in segResult; i is the word's absolute position
    for i in range(0, len(segResult)):
        # if the word is a sentiment word
        if i in senLoc:
            # senloc is the index into the list of sentiment-word positions
            senloc += 1
            # add the score of this sentiment word directly
            score += W * float(senWord[i])
            # print "score = %f" % score
            if senloc < len(senLoc) - 1:
                # check whether a negation word or degree adverb lies between this sentiment word and the next one
                # j is the absolute position
                for j in range(senLoc[senloc], senLoc[senloc + 1]):
                    # if there is a negation word
                    if j in notLoc:
                        W *= -1
                    # if there is a degree adverb
                    elif j in degreeLoc:
                        W *= float(degreeWord[j])
        # move i to the next sentiment word
        if senloc < len(senLoc) - 1:
            i = senLoc[senloc + 1]
    return score


# convert a list into a dictionary that maps each word to its position
def listToDist(wordlist):
    data = {}
    for x in range(0, len(wordlist)):
        data[wordlist[x]] = x
    return data

# plotting code below (details found via Baidu)
def runplt():
    plt.figure()
    plt.title('test')
    plt.xlabel('x')
    plt.ylabel('y')
    # define the axis ranges here, e.g. for 2000 data points write 0, 2000
    plt.axis([0, 100, -10, 10])
    plt.grid(True)
    return plt




# the main program starts here; everything above is function definitions


# get all files under the test/neg path
filepwd = eachFile("E:/test/neg")

score_var = []


# load the sentiment words, negation words and degree adverbs
words_vaule = words()

# loop over filepwd (that is, run on every file in the test/neg directory)
for x in filepwd: