I have a list of strings, some of which contain a caret followed by several random digits.
Example:
strings = ['MyString1^111',
'MyString2',
'MyString3',
'MyString4^222',
'MyString5^888']
The ultimate goal of my program is to remove the caret but keep the numbers after it, then place the strings into a pandas DataFrame together with the AverageTime values found in a "dictionary DF"; any string not in the "dictionary DF" should get 2.91. I tried the replace function, and while it removes the caret from the list of strings, the result is not usable in the DataFrame.
Here is what the "dictionary DF" looks like:
dictionary = [MyString2 : 3.76,
MyString3 : 2.66]
The two columns in the "dictionary DF" are "Name_Of_String" and "AverageTime".
Here's what I have so far:
noCaret = []
for i in strings:
    noCaret.append(i.replace('^', ''))
stringsDF = dictionary[dictionary.Name_Of_String.isin(noCaret)]
for i in noCaret:
    if stringsDF['Name_of_String'].str.contains(i).any():
        pass
    else:
        stringsDF.loc[(len(testDFu.Name_Of_String))-1] = [i, np.nan]
stringsDF.fillna(2.91, inplace = True)
stringsDF
stringsDF = [MyString1^111 : 2.91,
MyString2 : 3.76]
When I run this, I get only a partial DataFrame, and none of its rows contain the strings that had carets. How do I resolve this? Thanks!
Edit: I added what the "dictionary DF" and the resulting stringsDF look like, along with the column names.
CodePudding user response:
This produces the result you are looking for, I think: a DataFrame with two columns, Name_of_String and AverageTime. All items in strings are included, and those that are not in the dictionary get an AverageTime of 2.91.
Be careful when typing your code: you have switched between Name_of_String and Name_Of_String in your question, which will produce errors (if they are supposed to be the same column). Also, dictionaries use {}, not [], which cannot take key: value pairs.
import pandas as pd
strings = ['MyString1^111',
'MyString2',
'MyString3',
'MyString4^222',
'MyString5^888']
noCaret = [x.replace('^', '') for x in strings]
dictionary = {"MyString2": 3.76, "MyString3": 2.66}
stringsDF = pd.DataFrame(data={"Name_of_String": noCaret})
stringsDF["AverageTime"] = stringsDF["Name_of_String"].map(dictionary).fillna(2.91)
stringsDF
#Out:
#  Name_of_String  AverageTime
#0   MyString1111         2.91
#1      MyString2         3.76
#2      MyString3         2.66
#3   MyString4222         2.91
#4   MyString5888         2.91
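If the lookup table really is a DataFrame (the "dictionary DF" from the question) rather than a plain dict, the same approach should still work once the table is turned into a Series indexed by name. The following is only a sketch under that assumption: the name `dictionary_df` and its exact contents are made up for illustration, based on the two columns described in the question.

```python
import pandas as pd

strings = ['MyString1^111', 'MyString2', 'MyString3',
           'MyString4^222', 'MyString5^888']

# Assumed shape of the "dictionary DF": two columns, one row per known string.
dictionary_df = pd.DataFrame({
    "Name_Of_String": ["MyString2", "MyString3"],
    "AverageTime": [3.76, 2.66],
})

# Index the lookup table by name so it can be passed straight to map().
lookup = dictionary_df.set_index("Name_Of_String")["AverageTime"]

# Strip the carets, then look up each name; names with no match become NaN,
# which fillna() replaces with the default of 2.91.
stringsDF = pd.DataFrame({"Name_Of_String": [s.replace("^", "") for s in strings]})
stringsDF["AverageTime"] = stringsDF["Name_Of_String"].map(lookup).fillna(2.91)
print(stringsDF)
```

Mapping against a Series like this requires the names in the lookup table to be unique.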
CodePudding user response:
import pandas as pd
new_strings = []
strings = ['MyString1^111',
'MyString2',
'MyString3',
'MyString4^222',
'MyString5^888']
for i in strings:
    new_strings.append(i.replace('^', ''))
df = pd.DataFrame(new_strings)
print(df)
Output:
0 MyString1111
1 MyString2
2 MyString3
3 MyString4222
4 MyString5888
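If the strings are already in a DataFrame column, the caret can also be removed there directly instead of in a Python loop. This is a minimal sketch, assuming the column is called `Name_of_String` (a name borrowed from the other answer, not from this one). Passing `regex=False` matters because `^` is a regular-expression anchor; with it, the caret is treated as a literal character.

```python
import pandas as pd

strings = ['MyString1^111', 'MyString2', 'MyString3',
           'MyString4^222', 'MyString5^888']

# Column name is an assumption for illustration.
df = pd.DataFrame({"Name_of_String": strings})

# regex=False makes '^' a literal caret instead of the start-of-string anchor.
df["Name_of_String"] = df["Name_of_String"].str.replace("^", "", regex=False)
print(df)
```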