According to the Beam DataFrame overview, it looks like I can convert a Beam PCollection to a DataFrame using
from apache_beam.dataframe.convert import to_dataframe
df = to_dataframe(pcollections)
However, df is still a Beam DataFrame, not a pandas DataFrame. Is it possible to convert it to a pure pandas DataFrame?
(The DataFrame in my problem is small enough to fit into the memory of a single machine. Also, the Beam DataFrame API is missing some critical features, so I still need pure pandas functionality.)
CodePudding user response:
import pandas as pd
df = pd.DataFrame(data=your_data)  # your_data: placeholder for the in-memory data to wrap
print(df)
CodePudding user response:
import pandas as pd
from pandas import DataFrame
df = DataFrame(your_data)  # your_data: placeholder; passing an empty string here raises ValueError