I'm trying to move data from local storage to HDFS using Jupyter after data cleaning, but I ran into some issues and the data won't move into HDFS (HDFS and Jupyter are deployed in Minikube on Kubernetes).
This is the code in jupyter :
writer = pd.ExcelWriter("data.xlsx")
data.to_excel(excel_writer=writer)
writer.save("hdfs://hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local/data")
The error is :
save() takes 1 positional argument but 2 were given
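The error comes from the `ExcelWriter` API itself: the destination path is fixed when the writer is constructed, and `save()` takes no path argument (it only flushes the file), so it cannot be pointed at an HDFS URL. A minimal local sketch of the correct pattern, assuming the `openpyxl` engine is installed:

```python
import pandas as pd

data = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# The output path belongs to the ExcelWriter constructor, not to save();
# using it as a context manager closes (and saves) the file automatically.
with pd.ExcelWriter("data.xlsx") as writer:
    data.to_excel(writer, index=False)

# Read the file back to confirm the round trip
check = pd.read_excel("data.xlsx")
```

Writing to HDFS still needs a separate step (e.g. an HDFS client), since pandas writes only to local paths or file-like objects here.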
CodePudding user response:
This is how I solved my problem, using the `hdfs` library's WebHDFS client:
from hdfs import InsecureClient
import pandas as pd

client = InsecureClient('http://hdfs-namenode.default.svc.cluster.local:50070', user='hdfs')
data = pd.read_csv('name_of_file.csv')

# client.write() is a context manager that yields a writable file object on HDFS
with client.write('path/name_of_file.csv', encoding='utf-8') as writer:
    data.to_csv(writer)
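This works because `to_csv` accepts any writable file-like object, not just a path, and the HDFS client's `write()` context manager yields exactly such an object. A quick local illustration of the same idea with an in-memory buffer:

```python
import io
import pandas as pd

data = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# to_csv writes to any file-like object; an HDFS writer behaves the same way
buf = io.StringIO()
data.to_csv(buf, index=False)
csv_text = buf.getvalue()
```

So the DataFrame is streamed straight into HDFS without first writing a temporary local file.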