I have a txt file as following:
sub_ID: ['sub-01','sub-02']
ses_ID: ['ses-01','ses-01']
mean: [0.3456,0.446]
I want to read this and convert it to a dataframe like the one in the image (desired dataframe). Don't mind the values in the mean_e_field column; they are just an example. The values should be the same as in the txt file.
I tried the following and got the dataframe shown in the image, but I can't transform it into my preferred dataframe:
data = pd.read_csv(filename, sep=",", header=None)
data
I appreciate your answers in advance.
CodePudding user response:
So, several things here.
The reason why your previous data = pd.read_csv(filename, sep=",", header=None) did not work is that you've indicated that it should separate on "," and it treats every single line as a row to be split. So, sub_ID: [ 'sub-01','sub-02' ] is split into sub_ID: [ 'sub-01' and 'sub-02' ].
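The splitting described above can be reproduced without a file. This is a minimal sketch, assuming the three lines from the question are held in a string (the names raw and df_raw are only illustrative); it runs the same read_csv call and shows that each line ends up cut in two at its comma.

```python
import io

import pandas as pd

# The three lines from the question, kept in memory instead of a file.
raw = (
    "sub_ID: ['sub-01','sub-02']\n"
    "ses_ID: ['ses-01','ses-01']\n"
    "mean: [0.3456,0.446]\n"
)

# Same call as in the question: split every line on commas, no header row.
df_raw = pd.read_csv(io.StringIO(raw), sep=",", header=None)

# Three rows (one per line) and two columns (one per comma-separated piece).
print(df_raw.shape)

# The first line is cut in the middle of its list.
print(df_raw.iloc[0].tolist())
```

Each cell is just a fragment of text such as sub_ID: ['sub-01', so there is nothing column-like for pandas to work with.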
The example data you've provided seems to be in YAML format:
sub_ID: [ 'sub-01','sub-02' ]
ses_ID: [ 'ses-01','ses-01' ]
mean: [ 0.3456,0.446 ]
If it were CSV, the data would look as follows (it does not):
sub_ID,ses_ID,mean
sub-01,ses-01,0.3456
sub-02,ses-01,0.446
To read this data into a dataframe, you will either need to preprocess it into another format (e.g. CSV) or read it as YAML into a dict and pass that to pandas.DataFrame.
For example:
import yaml

with open("data.txt", "r") as file:
    try:
        # This returns a dict from the given YAML data.
        data = yaml.safe_load(file)
    except yaml.YAMLError as exc:
        # Stop here: without parsed data there is nothing to print or convert.
        raise SystemExit(f"Could not parse data.txt as YAML: {exc}")

print(data)
# {'sub_ID': ['sub-01', 'sub-02'], 'ses_ID': ['ses-01', 'ses-01'], 'mean': [0.3456, 0.446]}
After that, you can create a DataFrame from this dict:
import pandas as pd

df = pd.DataFrame(data)
df.head()
+---+--------+--------+--------+
|   | sub_ID | ses_ID | mean   |
+---+--------+--------+--------+
| 0 | sub-01 | ses-01 | 0.3456 |
| 1 | sub-02 | ses-01 | 0.446  |
+---+--------+--------+--------+
as desired.
If you have certain entries that are not valid YAML, you will need to preprocess the data before loading it into pandas.
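One possible way to do that preprocessing, sketched here under the assumption that every line still has the form key: [values] with the list written as a Python literal, is to split each line at the first colon and parse the list with ast.literal_eval from the standard library. The helper name parse_key_list_lines is hypothetical and not part of pandas or PyYAML.

```python
import ast


def parse_key_list_lines(text):
    """Parse lines of the form `key: [v1, v2, ...]` into a dict of lists."""
    parsed = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split at the first colon only; the list itself contains no colon here.
        key, _, value = line.partition(":")
        # literal_eval only accepts Python literals, so it does not run code.
        parsed[key.strip()] = ast.literal_eval(value.strip())
    return parsed


# The three lines from the question, kept in memory instead of a file.
raw = (
    "sub_ID: ['sub-01','sub-02']\n"
    "ses_ID: ['ses-01','ses-01']\n"
    "mean: [0.3456,0.446]\n"
)

columns = parse_key_list_lines(raw)
print(columns)
# {'sub_ID': ['sub-01', 'sub-02'], 'ses_ID': ['ses-01', 'ses-01'], 'mean': [0.3456, 0.446]}
```

The resulting dict has the same shape as the one returned by yaml.safe_load, so it can be passed to pd.DataFrame in the same way. Lines that deviate from the assumed form would make literal_eval raise an error and would need their own handling.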