I want to create the clinical dataframe with a sex column based on the Sex column in the raw_clinical_patient dataframe.
import pandas as pd
raw_clinical_patient = pd.read_csv("./gbm_tcga/data_clinical_patient.txt", sep="\t", header=4) # Skip first 4 rows
clinical = pd.DataFrame()
clinical["sex"] = raw_clinical_patient.loc[:,"Sex"]
clinical["last_fu"] = raw_clinical_patient.loc[:,"Last Alive Less Initial Pathologic Diagnosis Date Calculated Day Value"]
Traceback:
KeyError: 'Sex'
CodePudding user response:
Column labels are case sensitive, so there is probably a sex column in your raw_clinical_patient data frame rather than a Sex column. Printing raw_clinical_patient.columns will show the exact labels.
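To make the case-sensitivity point concrete, here is a minimal sketch of how one might look up a column without relying on its exact capitalization. The small data frame and the helper name find_column are hypothetical stand-ins, not part of the original question; the real file's labels may differ.

```python
import pandas as pd

# Hypothetical stand-in for raw_clinical_patient; the real labels may differ.
raw_clinical_patient = pd.DataFrame(
    {"SEX": ["Male", "Female"], "OS_MONTHS": [12.0, 30.5]}
)

# Show the exact labels pandas parsed from the header row.
print(list(raw_clinical_patient.columns))


def find_column(df, name):
    """Return the label in df that matches name, ignoring case and outer whitespace."""
    wanted = name.strip().lower()
    matches = [col for col in df.columns if str(col).strip().lower() == wanted]
    if len(matches) != 1:
        raise KeyError(f"expected exactly one column matching {name!r}, found {matches}")
    return matches[0]


# Resolve the real label first, then index with it.
sex_col = find_column(raw_clinical_patient, "Sex")
clinical = pd.DataFrame()
clinical["sex"] = raw_clinical_patient[sex_col]
```

If the helper raises a KeyError, the label is either missing entirely or ambiguous, and the printed column list is the place to look.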
CodePudding user response:
You may simply write
# .copy() avoids SettingWithCopyWarning on later assignments to clinical
clinical = raw_clinical_patient[["Sex", "Last Alive Less Initial Pathologic Diagnosis Date Calculated Day Value"]].copy()
clinical.columns = ["sex", "last_fu"]  # rename accordingly
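A variation on the same idea, sketched under the assumption that the source labels really are spelled as in the question: passing a mapping to rename ties each new name to its old name, so the result does not depend on the order of the selected columns. The sample rows and the "Patient ID" column are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in data; values and the "Patient ID" column are invented.
raw_clinical_patient = pd.DataFrame(
    {
        "Patient ID": ["P1", "P2"],
        "Sex": ["Male", "Female"],
        "Last Alive Less Initial Pathologic Diagnosis Date Calculated Day Value": [100, 250],
    }
)

# Old label -> new label; the keys double as the list of columns to keep.
column_map = {
    "Sex": "sex",
    "Last Alive Less Initial Pathologic Diagnosis Date Calculated Day Value": "last_fu",
}

# Select the wanted columns, then rename them; rename returns a new frame.
clinical = raw_clinical_patient[list(column_map)].rename(columns=column_map)
print(clinical)
```

As with the positional version, the selection step still raises a KeyError if any key in column_map is not an exact match for a label in the raw frame.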