I'm trying to filter a DataFrame by date, but filtering it with expressions like this would be really cumbersome for a date like "2019-11-01 10:15:00".
My goal is to do something like the Python version:
use polars::export::chrono::NaiveDateTime;
use polars::prelude::*;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let df = LazyCsvReader::new(path)
        .with_parse_dates(true)
        .has_header(true)
        .finish()?
        .collect()?;
    let dt = NaiveDateTime::parse_from_str("2019-11-01 10:15:00", "%Y-%m-%d %H:%M:%S")?;
    // This will not compile!
    let filtered = df.filter(col("time") < dt);
}
However, I'm having a really hard time filtering the DataFrame in place, or even just creating a boolean mask.
CodePudding user response:
After more time than I dare to admit, I finally solved it by using the eager API. There is probably a better solution in the lazy API, but this works for now!
use polars::export::chrono::NaiveDateTime;
use polars::prelude::*;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // `path` (the CSV location) is assumed to be defined elsewhere
    let df = LazyCsvReader::new(path)
        .with_parse_dates(true)
        .has_header(true)
        .finish()?
        .collect()?;
    // Set date to filter by
    let dt = NaiveDateTime::parse_from_str("2019-11-01 10:15:00", "%Y-%m-%d %H:%M:%S")?;
    // Create boolean mask (note: unwrap() panics if "time" contains nulls)
    let mask: BooleanChunked = df["time"]
        .datetime()?
        .as_datetime_iter()
        .map(|x| x.unwrap() < dt)
        .collect();
    // New filtered df
    let filtered_df = df.filter(&mask)?;
    Ok(())
}
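The mask-then-filter idea can be tried out without polars. Below is a minimal std-only sketch, not the polars API: `before_mask` and `apply_mask` are hypothetical helper names, timestamps are kept as "YYYY-MM-DD HH:MM:SS" strings (which sort chronologically when compared as plain strings), and a missing value is treated as "does not match" rather than being unwrapped.

```rust
// Std-only sketch of the mask-then-filter pattern; no polars involved.
// `before_mask` and `apply_mask` are hypothetical helper names.

/// Build a boolean mask: true where the value is present and earlier than `cutoff`.
fn before_mask(times: &[Option<&str>], cutoff: &str) -> Vec<bool> {
    times
        .iter()
        .map(|t| t.map_or(false, |t| t < cutoff))
        .collect()
}

/// Keep only the rows whose mask entry is true.
fn apply_mask<T: Clone>(rows: &[T], mask: &[bool]) -> Vec<T> {
    rows.iter()
        .zip(mask)
        .filter(|(_, keep)| **keep)
        .map(|(row, _)| row.clone())
        .collect()
}

fn main() {
    let times = [
        Some("2019-11-01 09:00:00"),
        None, // a missing timestamp simply yields `false` in the mask
        Some("2019-11-01 10:15:00"),
        Some("2019-11-01 11:30:00"),
    ];
    let values = [1, 2, 3, 4];

    let mask = before_mask(&times, "2019-11-01 10:15:00");
    let kept = apply_mask(&values, &mask);
    println!("{:?} -> {:?}", mask, kept);
}
```

Whether a null should count as a non-match or be reported as an error depends on the data; the sketch just picks the non-panicking option.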
To get a date value from the "time" column and parse it as a NaiveDateTime:
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // `df` is assumed to be the DataFrame loaded as in the previous snippet
    // Let's take the last date from a series of datetime[µs]
    let date: Vec<Option<NaiveDateTime>> = df["time"]
        .tail(Some(1))
        .datetime()?
        .as_datetime_iter()
        .collect();
    // Create new NaiveDateTime, can be used as filter/condition in a map function
    // (date[0] is already an Option<NaiveDateTime>, so the string round trip is optional)
    let dt2 = NaiveDateTime::parse_from_str(&date[0].unwrap().to_string(), "%Y-%m-%d %H:%M:%S")?;
    Ok(())
}
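As for why `col("time") < dt` from the question cannot compile: Rust's `<` operator goes through `PartialOrd`, whose comparison methods return `bool`, so the operator cannot produce an expression object. Expression-style APIs therefore expose comparisons as methods instead. The following is a toy, std-only sketch of that idea. `Expr`, `col`, `lit`, `lt` and `matches` here are hypothetical stand-ins written for illustration, not the polars types.

```rust
use std::collections::HashMap;

// Toy expression type, NOT polars: every name here is hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Col(String),
    Lit(String),
    Lt(Box<Expr>, Box<Expr>),
}

fn col(name: &str) -> Expr {
    Expr::Col(name.to_string())
}

fn lit(value: &str) -> Expr {
    Expr::Lit(value.to_string())
}

impl Expr {
    /// Build a "less than" node instead of comparing right away.
    fn lt(self, other: Expr) -> Expr {
        Expr::Lt(Box::new(self), Box::new(other))
    }

    /// Resolve a column or literal to a string value for one row.
    fn value(&self, row: &HashMap<&str, &str>) -> Option<String> {
        match self {
            Expr::Col(name) => row.get(name.as_str()).map(|v| v.to_string()),
            Expr::Lit(v) => Some(v.clone()),
            Expr::Lt(..) => None,
        }
    }

    /// Evaluate a comparison for one row; anything unresolved counts as false.
    fn matches(&self, row: &HashMap<&str, &str>) -> bool {
        match self {
            Expr::Lt(a, b) => match (a.value(row), b.value(row)) {
                (Some(a), Some(b)) => a < b,
                _ => false,
            },
            _ => false,
        }
    }
}

fn main() {
    // The predicate is only a description; nothing is compared until `matches` runs.
    let predicate = col("time").lt(lit("2019-11-01 10:15:00"));
    let row = HashMap::from([("time", "2019-11-01 09:00:00")]);
    println!("{:?} matches: {}", predicate, predicate.matches(&row));
}
```

In polars itself, the lazy filter would presumably be written along the lines of `col("time").lt(lit(dt))` and applied before `.collect()`, assuming `lit` accepts a `NaiveDateTime` in the polars version and feature set in use. I have not verified that against the version used above.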