Julia DataFrames: How to efficiently convert column data and column name to uppercase/lowercase at o

Time:08-01

I have a DataFrame that looks like this:

5×4 DataFrame
 Row │ Col_1      Col_2     Col_a   Col_z  
     │ Float64    Float64   String  String 
─────┼─────────────────────────────────────
   1 │ 0.201256   0.418266  aabbcc  xxyyzz
   2 │ 0.804066   0.136453  aabbcc  xxyyzz
   3 │ 0.442338   0.305655  aabbcc  xxyyzz
   4 │ 0.0676846  0.113499  aabbcc  xxyyzz
   5 │ 0.380939   0.773559  aabbcc  xxyyzz

but with many String columns. What is an efficient (and preferably one-liner) solution to convert both column data and column names to uppercase for only these columns? So to get something like:

5×4 DataFrame
 Row │ Col_1      Col_2     COL_A   COL_Z  
     │ Float64    Float64   String  String 
─────┼─────────────────────────────────────
   1 │ 0.201256   0.418266  AABBCC  XXYYZZ
   2 │ 0.804066   0.136453  AABBCC  XXYYZZ
   3 │ 0.442338   0.305655  AABBCC  XXYYZZ
   4 │ 0.0676846  0.113499  AABBCC  XXYYZZ
   5 │ 0.380939   0.773559  AABBCC  XXYYZZ

CodePudding user response:

If df is your data frame, there are two options.

If you do not need to keep the column order:

select(df, Not(names(df, AbstractString)), names(df, AbstractString) .=> ByRow(uppercase) .=> uppercase)

If you need to keep the column order:

select(df, [n => eltype(v) <: AbstractString ? ByRow(uppercase) => uppercase : n for (n, v) in pairs(eachcol(df))])
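
To make the difference between the two options visible, here is a small sketch on made-up data (not the data from the question) in which a String column sits in front of a numeric one. The variable names reordered and ordered are only illustrative.

```julia
using DataFrames

# Made-up sample data: a String column placed before a numeric column.
df = DataFrame(Col_a = ["aabbcc", "aabbcc"],
               Col_1 = [0.2, 0.8],
               Col_z = ["xxyyzz", "xxyyzz"])

# Option 1: non-string columns come first, then the uppercased string columns.
reordered = select(df, Not(names(df, AbstractString)),
                   names(df, AbstractString) .=> ByRow(uppercase) .=> uppercase)

# Option 2: every column stays where it was.
ordered = select(df, [n => eltype(v) <: AbstractString ? ByRow(uppercase) => uppercase : n for (n, v) in pairs(eachcol(df))])

println(names(reordered))  # Col_1, COL_A, COL_Z
println(names(ordered))    # COL_A, Col_1, COL_Z
```

In the question's data the String columns are already last, so both options would give the same column order there.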

(Both solutions assume there are no missing values in your data, as in your question.)
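
If the String columns can contain missing values, one possible adaptation is sketched below. It is not part of the answer above: upper_or_missing is a hypothetical helper introduced here, the sample data is made up, and the column selector is widened to Union{Missing, AbstractString} so that columns allowing missing values are still picked up.

```julia
using DataFrames

# Made-up sample data with a missing value in a String column.
df = DataFrame(Col_1 = [0.2, 0.8],
               Col_a = ["aabbcc", missing],
               Col_z = ["xxyyzz", "xxyyzz"])

# Hypothetical helper: uppercase a string, pass `missing` through unchanged.
upper_or_missing(x) = ismissing(x) ? missing : uppercase(x)

# Select columns whose element type is a string type, with or without missing.
strcols = names(df, Union{Missing, AbstractString})

out = select(df, Not(strcols), strcols .=> ByRow(upper_or_missing) .=> uppercase)
```

Note that this selector would also match a column consisting only of missing values; the helper leaves such a column's data unchanged and only its name is uppercased.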
