I am trying to split a flat JSON file on a delimiter in Python 3 (via Jupyter) in order to create an extra column. Pandas automatically reads the file and produces a row for each string between "...". When I read it without a delimiter, the file loads just fine. Here are the first four rows:
0 <h1>lorum ipsum|
1 <h2>lorum ipsum|
2
3 <h5>lorum ipsum...
However, I would like to split off an extra column every time the JSON file has a >, but I receive an extensive error that I do not understand. What am I doing wrong?
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-38-647ecd72fd56> in <module>
1 import sys
2 import pandas as pd
----> 3 df = pd.read_json('/filepath/doc.json' , delimiter='>', engine='python', header=None)
4 print (df)
~/opt/anaconda3/lib/python3.8/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
197 else:
198 kwargs[new_arg_name] = new_arg_value
--> 199 return func(*args, **kwargs)
200
201 return cast(F, wrapper)
~/opt/anaconda3/lib/python3.8/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
297 )
298 warnings.warn(msg, FutureWarning, stacklevel=stacklevel)
--> 299 return func(*args, **kwargs)
300
301 return wrapper
TypeError: read_json() got an unexpected keyword argument 'delimiter'
The code that produces the error is:
import pandas as pd
df = pd.read_json('/path/file.json' , delimiter='>', engine='python', header=None)
print (df)
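A note on the traceback above: `pd.read_json` does not accept `delimiter` (the traceback says exactly that), and `engine` and `header` look like `read_csv` options as well. One possible way around it is to load the JSON first and split the strings afterwards. The following is only a minimal sketch: it assumes the file is a flat JSON array of strings, uses an inline string `raw` as a stand-in for the real file, and the names `row`, `tag`, and `text` are made up for illustration.

```python
import json

import pandas as pd

# Hypothetical stand-in for the file contents: a flat JSON array of strings.
# With a real file, json.load(open(path)) would take the place of json.loads(raw).
raw = '["<h1>lorum ipsum|", "<h2>lorum ipsum|", "", "<h5>lorum ipsum"]'

rows = json.loads(raw)

# Put the strings in a Series so the .str accessor is available.
s = pd.Series(rows, name="row")

# Split each string at the first '>' only (n=1) and spread the pieces over columns.
parts = s.str.split(">", n=1, expand=True)
parts.columns = ["tag", "text"]  # illustrative column names

print(parts)
```

Rows without a `>` (such as the empty third row) keep their whole content in the first column and get a missing value in the second.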
CodePudding user response:
Thanks for both suggestions. I now get another strange error, although this should be fairly easy.
import pandas as pd
df = pd.read_json('/file/doc.json')
test = pd.DataFrame(df.row.str.split('>').tolist()
print (test)
error:
File "<ipython-input-49-0007a192a995>", line 7
print (test)
^
SyntaxError: invalid syntax
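For what it is worth, this SyntaxError is most likely not about `print` at all: the line before it opens `pd.DataFrame(` and never closes it, and Python 3.8 reports an unclosed parenthesis on the following line. A sketch with the parentheses balanced is below. It assumes a column literally named `row`, which is a hypothetical name here; the small inline DataFrame stands in for the result of `read_json`.

```python
import pandas as pd

# Stand-in for the loaded data; the column name "row" is an assumption.
df = pd.DataFrame({"row": ["<h1>lorum ipsum|", "<h2>lorum ipsum|"]})

# Same expression as in the failing snippet, with the closing parenthesis added.
test = pd.DataFrame(df["row"].str.split(">").tolist())

print(test)
```

Using `df["row"]` instead of `df.row` makes it more obvious when the column does not exist, because the lookup then raises a KeyError naming the missing column.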
CodePudding user response:
AttributeError: 'DataFrame' object has no attribute 'row', 'str', or 'split'. I tried all three.
CodePudding user response:
In other words, this is how far I have got:
import pandas as pd
df = pd.read_json('filepath/doc.json')
df.str.split(delimiter='>', expand=True)
print (df)
And I receive the error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-104-aaf69d70b64a> in <module>
2
3 df = pd.read_json('filepath/doc.json')
----> 4 df.str.split(delimiter='>', expand=True)
5
6
~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/generic.py in __getattr__(self, name)
5463 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5464 return self[name]
-> 5465 return object.__getattribute__(self, name)
5466
5467 def __setattr__(self, name: str, value) -> None:
AttributeError: 'DataFrame' object has no attribute 'str'
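The last traceback points at the cause: `.str` is an accessor on a Series (a single column), not on a whole DataFrame. In addition, `str.split` takes the separator as `pat` rather than `delimiter`, and it returns a new object instead of changing `df` in place. A hedged sketch of how this could look follows; it assumes the strings sit in the first column, whatever that column is called, and the `part_` prefix is just an illustrative choice.

```python
import pandas as pd

# Stand-in for the DataFrame returned by read_json; a single column labelled 0 is assumed.
df = pd.DataFrame({0: ["<h1>lorum ipsum|", "<h5>lorum ipsum"]})

# Pick the first column by position so the code does not depend on its name.
col = df.columns[0]

# .str works on the Series; the separator is passed as pat, and the result is kept.
split = df[col].str.split(pat=">", expand=True)

# Attach the new columns next to the original one.
df = df.join(split.add_prefix("part_"))

print(df)
```

Printing `df.columns` right after loading shows which column names actually exist, which should also explain the missing `row` attribute.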