How to index a dataframe with a user-defined string?

Time:06-26

Assumptions: I have a Julia DataFrame with a column titled article_id.

Normally, I create a DataFrame with something like df = DataFrame(CSV.File(dataFileName; delim = ",")). If I want the column for a known name, I can write df.article_id. I can also access that column with df."article_id".

However, if I create a string holding the column name, such as str = "article_id", I cannot index the DataFrame via df.str: doing so raises an error. This makes sense, since str itself is not a column of the DataFrame, even though the value of str is. How can I index the DataFrame to get the column named by the value of str? I'm looking for syntax similar to df.valueof(str).

Are there any solutions to this?

CodePudding user response:

From the DataFrames.jl manual's "Getting started" page:

Columns can be directly (i.e. without copying) accessed via df.col, df."col", df[!, :col] or df[!, "col"]. The two latter syntaxes are more flexible as they allow passing a variable holding the name of the column, and not only a literal name.

So you can write df[!, str], and that will be equivalent to df.article_id if str == "article_id".
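
As a minimal sketch of that equivalence: the small DataFrame below is a hypothetical stand-in for the CSV-backed one in the question, and the title column is made up for illustration.

```julia
using DataFrames

# Hypothetical data standing in for the asker's CSV file
df = DataFrame(article_id = [101, 102, 103], title = ["a", "b", "c"])

str = "article_id"            # column name held in a variable

col = df[!, str]              # look the column up by the value of str, without copying
col === df.article_id         # true: the very same vector that df.article_id returns

# A Symbol built from the string selects the same column
df[!, Symbol(str)] === col    # true
```

Because no copy is made, changing an element of col also changes the column stored in df.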

The Indexing section of the manual goes into even more detail, for when you need more advanced types of indexing or want a deeper understanding of the options.

CodePudding user response:

As an additional reference: writing df.colname is equivalent to writing getproperty(df, :colname). Therefore, if the column name is stored in the str variable, you can write getproperty(df, str).

However, as Sundar R noted, it is usually more convenient to use indexing instead of property access. The two most common patterns are df[!, str], which is equivalent to getproperty(df, str) and returns the column without copying it, and df[:, str], which returns a copy of the column.
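
The difference between the three forms can be sketched as follows. The data is again hypothetical, and the sketch assumes a DataFrames.jl version in which getproperty accepts a string column name, as the answer above describes.

```julia
using DataFrames

# Hypothetical data for illustration
df = DataFrame(article_id = [101, 102, 103])
str = "article_id"

viaprop = getproperty(df, str)   # what df.article_id does, but with a runtime name
nocopy  = df[!, str]             # the stored column itself
copied  = df[:, str]             # a fresh copy of the column

viaprop === nocopy               # true: same underlying vector
copied == nocopy                 # true: equal values
copied === nocopy                # false: a distinct vector

copied[1] = 0                    # leaves df untouched
nocopy[1] = -1                   # writes through to df
df.article_id[1]                 # -1
```

Use df[:, str] when the result will be modified independently of the DataFrame, and df[!, str] when avoiding the copy matters or in-place changes are intended.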
