I have a problem. My column contains lists, and each list holds a single int. How can I convert each of these lists to an int value?
0 [1]
1 [0]
2 [1]
3 [0]
4 [0]
...
9869 [1]
9870 [1]
9871 [1]
9872 [0]
9873 [0]
Name: predicted, Length: 9874, dtype: object
What I tried:
df_testing['predicted'] = df_testing['predicted'].astype('str')
df_testing['predicted'].replace('[','',inplace=True)
df_testing['predicted'].replace(']','',inplace=True)
df_testing.predicted = pd.to_numeric(df_testing.predicted, errors='coerce')
[OUT]
0 NaN
1 NaN
---
df_testing['predicted'] = df_testing['predicted'].apply(lambda x: list(map(int, x)))
[OUT]
ValueError: invalid literal for int() with base 10: '['
--------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [45], in <cell line: 1>()
----> 1 df_testing['predicted'] = df_testing['predicted'].apply(lambda x: list(map(int, x)))
File ~\Anaconda3\lib\site-packages\pandas\core\series.py:4433, in Series.apply(self, func, convert_dtype, args, **kwargs)
4323 def apply(
4324 self,
4325 func: AggFuncType,
(...)
4328 **kwargs,
4329 ) -> DataFrame | Series:
4330 """
4331 Invoke function on values of Series.
4332
(...)
4431 dtype: float64
4432 """
-> 4433 return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
File ~\Anaconda3\lib\site-packages\pandas\core\apply.py:1082, in SeriesApply.apply(self)
1078 if isinstance(self.f, str):
1079 # if we are a string, try to dispatch
1080 return self.apply_str()
-> 1082 return self.apply_standard()
File ~\Anaconda3\lib\site-packages\pandas\core\apply.py:1137, in SeriesApply.apply_standard(self)
1131 values = obj.astype(object)._values
1132 # error: Argument 2 to "map_infer" has incompatible type
1133 # "Union[Callable[..., Any], str, List[Union[Callable[..., Any], str]],
1134 # Dict[Hashable, Union[Union[Callable[..., Any], str],
1135 # List[Union[Callable[..., Any], str]]]]]"; expected
1136 # "Callable[[Any], Any]"
-> 1137 mapped = lib.map_infer(
1138 values,
1139 f, # type: ignore[arg-type]
1140 convert=self.convert_dtype,
1141 )
1143 if len(mapped) and isinstance(mapped[0], ABCSeries):
1144 # GH#43986 Need to do list(mapped) in order to get treated as nested
1145 # See also GH#25959 regarding EA support
1146 return obj._constructor_expanddim(list(mapped), index=obj.index)
File ~\Anaconda3\lib\site-packages\pandas\_libs\lib.pyx:2870, in pandas._libs.lib.map_infer()
Input In [45], in <lambda>(x)
----> 1 df_testing['predicted'] = df_testing['predicted'].apply(lambda x: list(map(int, x)))
ValueError: invalid literal for int() with base 10: '['
CodePudding user response:
Here is one more way to do it. Since each cell holds a single value in square brackets, treat it as a string, strip the brackets, and convert to int (assuming every value really is an integer).
df['predicted'] = df['predicted'].str.strip('[]').astype(int)
df
predicted
0 1
1 0
2 1
3 0
4 0
9869 1
9870 1
9871 1
9872 0
9873 0
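If it is unclear whether the cells hold real Python lists or their string form (the ValueError in the question suggests strings such as '[1]'), a minimal sketch that handles both could look like the following. The helper name first_int and the sample data are hypothetical, made up for illustration only.

```python
import ast

import pandas as pd


def first_int(cell):
    """Return the single element of a list-like cell as an int."""
    # Strings such as "[1]" are parsed into real lists first.
    if isinstance(cell, str):
        cell = ast.literal_eval(cell)
    return int(cell[0])


# Hypothetical sample mixing both representations.
df = pd.DataFrame({"predicted": ["[1]", "[0]", [1], [0]]})
df["predicted"] = df["predicted"].apply(first_int)
print(df["predicted"].tolist())
```

As a side note, the first attempt in the question most likely returned NaN because Series.replace with its default regex=False matches whole cell values rather than substrings; Series.str.replace('[', '', regex=False) is the substring version.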
CodePudding user response:
If you are sure that every row contains a non-empty list, you can take its first element with a lambda function:
df_testing['predicted'] = df_testing['predicted'].apply(lambda x: x[0])
Otherwise, you can write a function that checks whether the list has any elements and, for example, takes the mean.
I assumed that the list elements are numeric; if they are not, you can also convert the type inside that function.
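A rough sketch of such a function is shown here. It assumes the cells are real Python lists, returns NaN for empty lists, and averages the rest; the name list_to_number and the sample frame are hypothetical.

```python
import pandas as pd


def list_to_number(cell):
    """Collapse a list cell to one number; empty lists become NaN."""
    if len(cell) == 0:
        return float("nan")
    # Cast each element to float, then take the mean.
    values = [float(v) for v in cell]
    return sum(values) / len(values)


# Hypothetical sample, including an empty list and a two-element list.
df_testing = pd.DataFrame({"predicted": [[1], [0], [], [1, 0]]})
df_testing["predicted"] = df_testing["predicted"].apply(list_to_number)
print(df_testing)
```

Because NaN is a float, the resulting column is float rather than int. If no lists are empty and each has exactly one element, calling .astype(int) afterwards restores an integer dtype.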