import pandas
df['findings'] = df['findings'].astype(str)
#df['findings'] = df['findings'].astype('string')
df["new_column"] = GPT2_model(df['findings'], min_length=60)
After running this I get the following error, even after converting the findings column to strings.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-37-1225bf7a7a14> in <module>
----> 1 df["new_column"] = GPT2_model(df['findings'], min_length=60)
5 frames
/usr/local/lib/python3.7/dist-packages/spacy/language.py in _ensure_doc(self, doc_like)
1106 if isinstance(doc_like, bytes):
1107 return Doc(self.vocab).from_bytes(doc_like)
-> 1108 raise ValueError(Errors.E1041.format(type=type(doc_like)))
1109
1110 def _ensure_doc_with_context(
ValueError: [E1041] Expected a string, Doc, or bytes as input, but got: <class 'pandas.core.series.Series'>
CodePudding user response:
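Your method/model GPT2_model doesn't take a Pandas Series object. That's what the error is complaining about. Calling astype(str) only converts the values inside the column to strings; df['findings'] itself is still a Series, so the model receives the whole column at once instead of one string. You can instead apply the method to each value in your findings column.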
df['new_column'] = df['findings'].apply(GPT2_model, min_length=60)
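To see why this works without loading the model, here is a minimal sketch. It assumes a made-up stand-in function, fake_summarizer, that (like GPT2_model in the traceback) only accepts a single string; the sample data and min_length=5 are also just for illustration. Series.apply calls the function once per value and forwards extra keyword arguments such as min_length to it.

```python
import pandas as pd


def fake_summarizer(text, min_length=60):
    # Hypothetical stand-in for GPT2_model: it only accepts one string.
    if not isinstance(text, str):
        raise ValueError(f"Expected a string, but got: {type(text)}")
    # Placeholder "summary": just truncate the text to min_length characters.
    return text[:min_length]


# Illustrative data, not taken from the original question.
df = pd.DataFrame({"findings": ["first report text", "second report text"]})
df["findings"] = df["findings"].astype(str)

# Passing the whole column hands the function a Series, which it rejects.
try:
    fake_summarizer(df["findings"], min_length=5)
    whole_series_failed = False
except ValueError:
    whole_series_failed = True

# apply() calls the function once per value and passes min_length through.
df["new_column"] = df["findings"].apply(fake_summarizer, min_length=5)
print(df)
```

Note that apply runs the function row by row, so with a real model this may be slow on a large dataframe.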