I have a dataframe of this style:
id patient_full_name
7805 TOMAS FRANCONI
7810 Camila Gualtieri
7821 Lola Borrego
7823 XIMENA ALVAREZ LANUS
7824 MONICA VIVIANA RODRIGUEZ DE MARENGO
I need to extract the first name from each value in the second column, that is, everything up to the first space, and I don't know how.
I would like the result in a structure like this:
patients_names = ["TOMAS","CAMILA","LOLA","XIMENA","MONICA",...."N-NAME"]
All of this should be done with pandas in Python.
CodePudding user response:
You can use the split function in a list comprehension to do this:
import pandas as pd

df = pd.DataFrame([
    {"id": 7805, "patient_full_name": "TOMAS FRANCONI"},
    {"id": 7810, "patient_full_name": "Camila Gualtieri"},
    {"id": 7821, "patient_full_name": "Lola Borrego"}
])
df["first_name"] = [n.split(" ")[0] for n in df["patient_full_name"]]
That adds a column (first_name) with the output you wanted, which you can then pull off as a list or series if you want:
first_name_as_series = df["first_name"]
first_name_as_list = list(df["first_name"])
In your question, you show the desired output in all upper case. That's easy to get with a simple tweak to the list comprehension:
df["first_name"] = [n.split(" ")[0].upper() for n in df["patient_full_name"]]
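For comparison, the same result can probably be reached without a list comprehension by chaining pandas string methods. This is only a sketch: it assumes the column is named patient_full_name as in the question, rebuilds the five sample rows so it can run on its own, and strips leading whitespace as a precaution that the question does not ask for.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "id": [7805, 7810, 7821, 7823, 7824],
    "patient_full_name": [
        "TOMAS FRANCONI",
        "Camila Gualtieri",
        "Lola Borrego",
        "XIMENA ALVAREZ LANUS",
        "MONICA VIVIANA RODRIGUEZ DE MARENGO",
    ],
})

patients_names = (
    df["patient_full_name"]
    .str.strip()   # drop stray leading/trailing whitespace (precaution)
    .str.split()   # split each name on runs of whitespace
    .str[0]        # keep the first token of each name
    .str.upper()   # upper-case, as in the desired output
    .tolist()      # plain Python list
)

print(patients_names)
```

Unlike the list comprehension, the .str accessor passes missing values through as NaN instead of raising an error, which may matter if some rows have no name.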
CodePudding user response:
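You can also do it with str.extract, which does not rely on an explicit Python loop. The pattern ^(\S+) captures everything before the first whitespace:
(df
 .assign(first_name=lambda x: x["patient_full_name"].str.extract(r"^(\S+)", expand=False))
)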
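To see why the choice of pattern matters when extracting, here is a small sketch comparing a greedy pattern, `(.*) `, with one anchored to the first run of non-whitespace characters, `^(\S+)`. The sample Series is made up for illustration, and the single-word name "Cher" is a hypothetical value that does not appear in the question.

```python
import pandas as pd

# Illustrative names; "Cher" is a hypothetical single-word name.
names = pd.Series(["TOMAS FRANCONI", "XIMENA ALVAREZ LANUS", "Cher"])

# Greedy: .* grabs as much as possible, so the capture runs up to the LAST space.
greedy = names.str.extract(r"(.*) ", expand=False)

# Anchored: capture only the first run of non-whitespace characters.
first_token = names.str.extract(r"^(\S+)", expand=False)

print(pd.DataFrame({"name": names, "greedy": greedy, "first_token": first_token}))
```

With the greedy pattern, a three-word name keeps its first two words, and a name without any space fails to match and comes back as NaN. The anchored pattern returns the first word in both cases.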