I have some problem with uploding excel data in python.
Excel:
Code used to upload:
import pandas as pd
from google.colab import files
#uploaded = files.upload()
import io
df2 = pd.read_csv(io.BytesIO(uploaded['nodes.csv']),index_col=0)
print (df2)
Result:
Can you kindly help me?
CodePudding user response:
You said you wanted to convert the imported data to a numpy array, you could do so by doing the following without using pandas:
import numpy as np
arr = np.genfromtxt('nodes.csv', delimiter=',')
print(arr)
Check the documentation: https://numpy.org/doc/stable/reference/generated/numpy.genfromtxt.html
CodePudding user response:
If you have a csv-file file.csv
0,0
2,0
4,0
1,1.732051
3,1.732051
then
df = pd.read_csv("file.csv", index_col=0)
does produce
df =
0.1
0
2 0.000000
4 0.000000
1 1.732051
3 1.732051
Why is that: There are two 0
s in the first row and Pandas is mangling the dupes because the row is used for labels. The 0.1
isn't a number, it's a string (print(df.columns)
will show Index(['0.1'], dtype='object')
). If your file would look like
0,1
2,0
4,0
1,1.732051
3,1.732051
then this wouldn't happen, the output would look like
1
0
2 0.000000
4 0.000000
1 1.732051
3 1.732051
If your goal is NumPy array, then
arr = pd.read_csv("file.csv", header=None).values
leads to
array([[0. , 0. ],
[2. , 0. ],
[4. , 0. ],
[1. , 1.732051],
[3. , 1.732051]])