Assumptions: I have a Julia DataFrame with a column titled article_id.
Normally, I can create a DataFrame using syntax like df = DataFrame(CSV.File(dataFileName; delim = ",")). If I want the column for a known attribute, I can write df.article_id. I can also access that column with df."article_id".
However, if I create a string holding the column name, such as str = "article_id", I cannot access the column via df.str: doing so raises an error. This makes sense, since str itself is not an attribute of the DataFrame, even though the value of str is. How can I index the DataFrame to get the column corresponding to the value of str? I'm looking for syntax similar to df.valueof(str).
Are there any solutions to this?
CodePudding user response:
From the DataFrames.jl manual's "Getting started" page:
Columns can be directly (i.e. without copying) accessed via df.col, df."col", df[!, :col] or df[!, "col"]. The two latter syntaxes are more flexible as they allow passing a variable holding the name of the column, and not only a literal name.
So you can write df[!, str], and that will be equivalent to df.article_id if str == "article_id".
The Indexing section of the manual goes into even more detail, for when you need more advanced types of indexing or want a deeper understanding of the options.
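As a minimal sketch of the lookup described above, assuming a small hand-built DataFrame in place of the CSV file (the sample values and the title column are made up for illustration; only the article_id column name comes from the question):

```julia
using DataFrames

# Hypothetical sample data standing in for the CSV-backed DataFrame.
df = DataFrame(article_id = [101, 102, 103], title = ["a", "b", "c"])

# The column name is held in a variable rather than written literally.
str = "article_id"

# Index with the variable; `!` returns the stored column without copying.
col = df[!, str]
@assert col === df.article_id

# A Symbol built from the string works the same way.
sym = Symbol(str)
@assert df[!, sym] === df.article_id
```

The === checks confirm that both forms return the very same vector that df.article_id does, not a copy of it.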
CodePudding user response:
As an additional reference: writing df.colname is equivalent to writing getproperty(df, :colname). Therefore, if you have the column name stored in the str variable, you can write getproperty(df, str).
However, as Sundar R noted, it is usually more convenient to use indexing instead of property access. The two most common patterns are df[!, str], which is equivalent to getproperty(df, str) and gets you the column without copying it, and df[:, str], which gets you a copy of the column.
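To make the difference between the two indexing patterns concrete, here is a small sketch, again assuming made-up sample data (the names prop, nocopy and copied are just illustrative):

```julia
using DataFrames

# Hypothetical sample data; only the column name comes from the question.
df = DataFrame(article_id = [101, 102, 103])
str = "article_id"

prop   = getproperty(df, str)  # property access with the name held in a variable
nocopy = df[!, str]            # the stored column itself
copied = df[:, str]            # a fresh copy of the column

# Property access and `!` indexing return the same vector object.
@assert prop === nocopy

# The copy has equal contents but is a different object.
@assert copied == nocopy
@assert copied !== nocopy

# Mutating the copy leaves df untouched; mutating the non-copy changes df.
copied[1] = -1
nocopy[2] = -2
@assert df.article_id == [101, -2, 103]
```

In practice this means df[:, str] is the safer choice when you plan to modify the result independently of the DataFrame, while df[!, str] avoids the allocation when you only read the column or intend to change it in place.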