I have a CSV file that has a column id. I create a new one (m0), whose contents come from an HTTP call with id as a parameter:
d['m0'] = d['id'].apply(lambda id: pd.read_json(f"http://localhost:3000/{id}").get('H', {}).get('M0', "X"))
I also need to create columns m1 and m2 in a similar way. I could do:
d['m0'] = d['id'].apply(lambda id: pd.read_json(f"http://localhost:3000/{id}").get('H', {}).get('M0', "X"))
d['m1'] = d['id'].apply(lambda id: pd.read_json(f"http://localhost:3000/{id}").get('H', {}).get('M1', "X"))
d['m2'] = d['id'].apply(lambda id: pd.read_json(f"http://localhost:3000/{id}").get('H', {}).get('M2', "X"))
but the HTTP call is very expensive and slow (I have quite a lot of data).
Is there a way to combine all three calls into one, knowing that the structure of the JSON I get for a given id is:
"H": {
    "M0": "sjkdhfjkshd",
    "M1": "isudfyfsdif",
    "M2": "azednbzaebe"
}
CodePudding user response:
You can write a single function that makes one HTTP call per row, extracts all the required fields, and returns the result as a pandas Series:
def get_all_fields(row):
    # One HTTP call per row; 'H' holds the M0/M1/M2 values
    h_json = pd.read_json(f"http://localhost:3000/{row['id']}").get('H', {})
    return pd.Series([
        h_json.get('M0', "X"),
        h_json.get('M1', "X"),
        h_json.get('M2', "X"),
    ])

# axis=1 passes each row to the function; the three returned values fill m0, m1, m2
d[['m0', 'm1', 'm2']] = d.apply(get_all_fields, axis=1)
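
If the same id can appear in several rows, the number of requests can be reduced further by fetching each distinct id only once. The following is a minimal sketch of that idea, not a drop-in replacement: the helper names fetch_h and add_fields are hypothetical, it assumes the endpoint returns a JSON object with an "H" key as shown in the question, and it uses the standard library's urllib instead of pd.read_json to get a plain dict.

```python
import json
from urllib.request import urlopen

import pandas as pd

FIELDS = ["M0", "M1", "M2"]


def fetch_h(id_, base_url="http://localhost:3000"):
    """Fetch the JSON for one id and return its 'H' object (empty dict if missing)."""
    with urlopen(f"{base_url}/{id_}") as resp:
        return json.load(resp).get("H", {})


def add_fields(d, fetch=fetch_h, default="X"):
    """Return a copy of d with columns m0, m1, m2, calling fetch once per distinct id."""
    # One call per distinct id, in order of first appearance
    cache = {id_: fetch(id_) for id_ in d["id"].drop_duplicates()}
    # Build the three values for every row from the cached responses
    rows = [[cache[id_].get(f, default) for f in FIELDS] for id_ in d["id"]]
    new_cols = pd.DataFrame(rows, columns=[f.lower() for f in FIELDS], index=d.index)
    return d.join(new_cols)
```

Usage would then be d = add_fields(d). Passing a different fetch function (for example one backed by a requests.Session, or a stub in tests) avoids touching the rest of the code.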