Given information of the following form:
target f3 f2 f1 date
1 3 2 1 01/02/2000
0 6 5 4 02/02/2001
1 9 8 7 04/02/2002
1 12 11 10 06/02/2003
1 15 14 13 08/02/2004
1 18 17 16 09/02/2005
0 21 20 19 11/02/2006
1 24 23 22 13/02/2007
0 27 26 25 15/02/2008
1 30 29 28 16/02/2009
1 33 32 31 18/02/2010
1 36 35 34 20/02/2011
1 39 38 37 22/02/2012
1 42 41 40 23/02/2013
1 45 44 43 25/02/2014
I know from the project domain that the real-world distribution is closer to the later observations, but I still want to learn from the earlier ones. Is there a way to prioritize later observations in a classification model?
CodePudding user response:
Yes, there is: pass sample_weight to the classifier's fit method.
Have a look at the documentation for some of the classifiers, e.g. here or here.
In your case you would assign higher weights to recent observations.
There is also a short illustration for the SVM classifier available in this example.
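As a rough sketch of what this could look like with the data from the question, assuming scikit-learn is the library in use: the weights below decay exponentially with the age of each observation. The helper name recency_weights, the three-year half-life, and the choice of LogisticRegression are illustrative assumptions, not something the question or the linked documentation prescribes. Any classifier whose fit accepts sample_weight could be swapped in.

```python
from datetime import datetime

# Rows copied from the question: (target, f3, f2, f1, date as dd/mm/yyyy)
ROWS = [
    (1, 3, 2, 1, "01/02/2000"), (0, 6, 5, 4, "02/02/2001"),
    (1, 9, 8, 7, "04/02/2002"), (1, 12, 11, 10, "06/02/2003"),
    (1, 15, 14, 13, "08/02/2004"), (1, 18, 17, 16, "09/02/2005"),
    (0, 21, 20, 19, "11/02/2006"), (1, 24, 23, 22, "13/02/2007"),
    (0, 27, 26, 25, "15/02/2008"), (1, 30, 29, 28, "16/02/2009"),
    (1, 33, 32, 31, "18/02/2010"), (1, 36, 35, 34, "20/02/2011"),
    (1, 39, 38, 37, "22/02/2012"), (1, 42, 41, 40, "23/02/2013"),
    (1, 45, 44, 43, "25/02/2014"),
]


def recency_weights(dates, half_life_days=1095.0):
    """Hypothetical helper: exponential-decay weights by observation age.

    The newest observation gets weight 1.0; an observation that is
    half_life_days older gets weight 0.5. The half-life is an assumed
    tuning knob, not a value taken from the question.
    """
    parsed = [datetime.strptime(d, "%d/%m/%Y") for d in dates]
    newest = max(parsed)
    return [0.5 ** ((newest - p).days / half_life_days) for p in parsed]


dates = [row[4] for row in ROWS]
weights = recency_weights(dates)

if __name__ == "__main__":
    # Imported here so the weighting logic above needs only the standard library.
    from sklearn.linear_model import LogisticRegression

    X = [[row[1], row[2], row[3]] for row in ROWS]  # f3, f2, f1
    y = [row[0] for row in ROWS]

    clf = LogisticRegression()
    # Older rows still contribute to the fit, just with less influence.
    clf.fit(X, y, sample_weight=weights)
    print(clf.predict([[48, 47, 46]]))
```

A shorter half-life makes the model follow recent data more closely, while a very long one approaches an unweighted fit, so it is probably worth treating the half-life as a hyperparameter and checking it against a held-out set of the most recent observations.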