How to bring pytorch datasets into pandas dataframe

Time:02-02

I have seen a lot of code showing how to convert pandas data into a PyTorch dataset. However, I haven't found or been able to figure out how to do the reverse, i.e. load a PyTorch dataset into a pandas DataFrame. I want to load AG News into pandas. Can you please help? Thanks.

from torchtext.datasets import AG_NEWS

Answer:

You can use:

import pandas as pd
from torchtext.datasets import AG_NEWS

# AG_NEWS() yields (label, text) pairs for the train and test splits
train, test = AG_NEWS()
df_train = pd.DataFrame(train, columns=['label', 'text'])
df_test = pd.DataFrame(test, columns=['label', 'text'])

Output:

>>> df_train.head()
   label                                               text
0      3  Wall St. Bears Claw Back Into the Black (Reute...
1      3  Carlyle Looks Toward Commercial Aerospace (Reu...
2      3  Oil and Economy Cloud Stocks' Outlook (Reuters...
3      3  Iraq Halts Oil Exports from Main Southern Pipe...
4      3  Oil prices soar to all-time record, posing new...


>>> df_test.head()
   label                                               text
0      3  Fears for T N pension after talks Unions repre...
1      4  The Race is On: Second Private Team Sets Launc...
2      4  Ky. Company Wins Grant to Study Peptides (AP) ...
3      4  Prediction Unit Helps Forecast Wildfires (AP) ...
4      4  Calif. Aims to Limit Farm-Related Smog (AP) AP...
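
The same idea can be sketched for any dataset that yields (label, text) pairs. The example below is only an illustration: dataset_to_dataframe and sample_rows are hypothetical names, the sample rows are shortened stand-ins so the snippet runs without downloading anything, and the label-to-name mapping is an assumption about how AG News numbers its four classes.

```python
import pandas as pd

# Hypothetical stand-in for an iterable of (label, text) pairs; a real
# dataset split would be passed to the helper in its place.
sample_rows = [
    (3, "Wall St. Bears Claw Back Into the Black"),
    (4, "Prediction Unit Helps Forecast Wildfires"),
]

# Assumed label-to-name mapping for AG News (labels 1 to 4).
AG_NEWS_LABELS = {1: "World", 2: "Sports", 3: "Business", 4: "Sci/Tech"}


def dataset_to_dataframe(dataset, columns=("label", "text")):
    """Materialize an iterable of equal-length tuples as a DataFrame."""
    # list() consumes the iterable once, so this also works for
    # generators and other single-pass iterators.
    return pd.DataFrame(list(dataset), columns=list(columns))


df = dataset_to_dataframe(sample_rows)
# Add a readable class name next to the numeric label.
df["label_name"] = df["label"].map(AG_NEWS_LABELS)
```

Wrapping the dataset in list() reads it fully into memory, which is fine for a dataset of this size but may not suit much larger ones.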