pandas chunked reading does not return the first row of data

Time:10-03

A question for the experts here: I am reading a CSV file with pandas and chose to read it in chunks, but I found that the chunk does not contain the first row of data. The first row seems to be taken by default as the chunk's _info_axis (its column labels).
Is there a way to make the chunk include the first row as data?
The code is as follows:
import csv
import pandas as pd

with open('OD2011_ForCluster.csv', 'w') as csv_file:
    writer = csv.writer(csv_file)
    reader = pd.read_csv(r'OD2011_ALL.csv', iterator=True, encoding='GBK', skiprows=None)
    loop = True
    chunkSize = 8000000  # 4000000
    while loop:
        try:
            # Slot 0 holds the value from column 1; it is followed by 168 zero counters
            CalculateNum = [''] + [0] * 168
            chunk = reader.get_chunk(chunkSize)
            for i in range(0, len(chunk) - 2):
                if chunk.iat[i, 1] == chunk.iat[i + 1, 1]:
                    KeyHour = int((chunk.iat[i, 6] - 19) * 24 + (chunk.iat[i, 8] / 3600))

                    CalculateNum[KeyHour] = CalculateNum[KeyHour] + 1
                else:
                    CalculateNum[0] = chunk.iat[i, 1]
                    writer.writerow(CalculateNum)
                    CalculateNum.clear()
                    # Reset the counters for the next value of column 1
                    CalculateNum = [''] + [0] * 168
        except StopIteration:
            # get_chunk raises StopIteration once the file is exhausted
            loop = False
            print("Iteration is stopped.")
print('!!!!!!!!!!!!!!!')

The debugger shows the following for chunk:
   1  20089614  shopping park station  9.0  che station  11.0  22  502  75997
0  2  20089614  che station  11.0  shopping park station  9.0  22  601  78103
1  3  20089627  new ridge station  62.0  guomao station  2.0  21  621  55352
2  4  20089659  lake bei station

Clearly the chunk is using the first row of data as its column names. How can I solve this?

CodePudding user response:

Isn't there a header=None parameter for this?
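
For context, when no column names are passed, read_csv treats the first line of the file as the header by default, which would explain the output in the question. A minimal sketch of the difference, using a small made-up in-memory sample rather than the asker's file (the values and the four-column layout are assumptions for illustration only):

```python
import io

import pandas as pd

# Hypothetical sample with no header line (values are made up for illustration).
sample = "1,20089614,22,75997\n2,20089614,22,78103\n3,20089627,21,55352\n"

# Default behaviour: the first line is consumed as the column names.
with_default = pd.read_csv(io.StringIO(sample))

# header=None: every line is data, and the columns are numbered 0..n-1.
with_header_none = pd.read_csv(io.StringIO(sample), header=None)

print(len(with_default))      # 2 data rows, the first line became the header
print(len(with_header_none))  # 3 data rows
```

With header=None the positional lookups used in the question, such as chunk.iat[i, 1], should keep working, since iat addresses cells by position rather than by label.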

CodePudding user response:

Try setting header=None in the read_csv call.
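
A rough sketch of how that could look with the same iterator=True / get_chunk pattern as in the question. The in-memory sample, the tiny chunk size, and the names sample, first_values and total_rows are stand-ins invented for this example; for the real file you would presumably keep the path and encoding='GBK' as they are:

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: five rows, no header line.
sample = "\n".join(f"{i},2008{i:04d},22,{i * 3600}" for i in range(1, 6)) + "\n"

# header=None tells pandas that the first line is data, not column names.
reader = pd.read_csv(io.StringIO(sample), header=None, iterator=True)

first_values = []  # first cell of each chunk, to check that no row is lost
total_rows = 0
while True:
    try:
        chunk = reader.get_chunk(2)  # tiny chunk size, for illustration only
    except StopIteration:
        break  # raised once the input is exhausted
    total_rows += len(chunk)
    first_values.append(int(chunk.iat[0, 0]))

print(total_rows)    # 5
print(first_values)  # [1, 3, 5]
```

The first chunk now starts with row 1 instead of losing it to the header, and the later chunks carry on from where the previous one stopped.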