I am trying to pickle certain data so I have an easier time retrieving it. My code looks like this:
import pickle
import networkx as nx
import pandas as pd
import numpy as np
import load_data as load
# load the graph
g = load.local_data()
for node in g.nodes():
    # get node degree
    pickle.dump(g.degree(node), open("./pickles/degree.pickle", "wb"))
    # get in-degree of node
    pickle.dump(g.in_degree(node), open("./pickles/indegrees.pickle", "wb"))
    # get out-degree of node
    pickle.dump(g.out_degree(node), open("./pickles/outdegrees.pickle", "wb"))
    # get clustering coefficients of node
    pickle.dump(nx.clustering(g, node), open("./pickles/clustering.pickle", "wb"))
I have tried printing the output of these commands, and they give a full list of all nodes and their attributes. However, when I open the pickled file, it contains only a single integer. Does anyone know why that might be?
Answer:
In the current code, you open each pickle file in write mode ("wb") inside the loop, once per node. Opening a file in "wb" mode truncates it, so every iteration overwrites what the previous one stored, and only the value for the last node remains.
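The overwrite is easy to reproduce without a graph. Here is a minimal sketch, assuming a throwaway temporary file and made-up values standing in for the node degrees:

```python
import os
import pickle
import tempfile

# Hypothetical stand-ins for per-node degrees.
values = [3, 1, 4]

# Temporary location used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "degree.pickle")

# Mode "wb" truncates the file on every open,
# so each dump replaces the previous one.
for value in values:
    with open(path, "wb") as f:
        pickle.dump(value, f)

with open(path, "rb") as f:
    stored = pickle.load(f)

print(stored)  # only the value written last is still in the file
```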
What you might want to do instead is:
with open("./pickles/degree.pickle", "wb") as f:
    obj = [g.degree(node) for node in g]
    pickle.dump(obj, f)
...
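The same pattern should extend to the other three metrics. Below is a sketch, assuming a small hand-built directed graph in place of load.local_data() (which is not shown in the question) and a temporary directory in place of ./pickles. Each metric is stored as a dict keyed by node, so values can still be matched to their nodes after loading:

```python
import os
import pickle
import tempfile

import networkx as nx

# Assumed stand-in for load.local_data(): a tiny directed graph.
g = nx.DiGraph([(1, 2), (2, 3), (3, 1), (1, 3)])

# One dict per metric, keyed by node.
metrics = {
    "degree": dict(g.degree()),
    "indegrees": dict(g.in_degree()),
    "outdegrees": dict(g.out_degree()),
    "clustering": nx.clustering(g),
}

out_dir = tempfile.mkdtemp()  # stand-in for ./pickles

# Open each file once and dump the whole collection in a single call.
for name, values in metrics.items():
    with open(os.path.join(out_dir, name + ".pickle"), "wb") as f:
        pickle.dump(values, f)

# Read one file back to check that every node is present.
with open(os.path.join(out_dir, "degree.pickle"), "rb") as f:
    degrees = pickle.load(f)

print(degrees)
```

Calling the degree views and nx.clustering without a node argument computes the values for all nodes at once, so no explicit loop over the nodes is needed.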
Depending on your ultimate objective, it might be better to store the results as CSV or in some other format that can be shared safely between computers (perhaps one of the formats that pandas supports).
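As a rough sketch of the CSV route, assuming pandas is available and using made-up per-node values in place of the real graph metrics:

```python
import os
import tempfile

import pandas as pd

# Hypothetical per-node metrics; in practice these would come from the graph.
degree = {"a": 3, "b": 2, "c": 3}
in_degree = {"a": 1, "b": 1, "c": 2}

# One row per node, one column per metric.
df = pd.DataFrame({"degree": degree, "in_degree": in_degree})
df.index.name = "node"

# Temporary location used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "node_metrics.csv")
df.to_csv(path)

# Reading the file back restores the same table.
restored = pd.read_csv(path, index_col="node")
print(restored)
```

Unlike a pickle, a CSV file is plain text, so loading one from another machine does not run any code embedded in the file.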