Home > database >  Unstacking with 2 conditions
Unstacking with 2 conditions

Time:02-24

I am using MATLAB 2021b and I have the following data:

 ID = {'a','a','a','a','b','b','b','b'}';
 DATE = [2010,2010,2011,2011,2011,2011,2012,2012]';
 FIELD_ID = {'f1','f2','f1','f2','f1','f2','f1','f2'}';
 VALUE = [1,2,5,6,1,1,7,8]';
 T_before = table(ID,DATE,FIELD_ID,VALUE);

T_before =

8×4 table

 ID      DATE    FIELD_ID    VALUE
_____    ____    ________    _____

{'a'}    2010     {'f1'}       1  
{'a'}    2010     {'f2'}       2  
{'a'}    2011     {'f1'}       5  
{'a'}    2011     {'f2'}       6  
{'b'}    2011     {'f1'}       1  
{'b'}    2011     {'f2'}       1  
{'b'}    2012     {'f1'}       7  
{'b'}    2012     {'f2'}       8 

In reality the table is a lot longer and contains more fields. The latest DATE for a given ID can be different. The column FIELD_ID contains fields and their respective value is in the column VALUE.

What I would like to do is unstack this table in long format to have one row per ID with the fields as colums. I have one condition, I want only to unstack the rows containing the latest value in the field DATE. It should look like the following:

T_after =

2×4 table

 ID      DATE    f1    f2
_____    ____    __    __

{'a'}    2011    5     6 
{'b'}    2012    7     8 

CodePudding user response:

One for loop can do this quickly.

ID = {'a','a','a','a','b','b','b','b'}';
DATE = [2010,2010,2011,2011,2011,2011,2012,2012]';
FIELD_ID = {'f1','f2','f1','f2','f1','f2','f1','f2'}';
VALUE = [1,2,5,6,1,1,7,8]';
T_before = table(ID,DATE,FIELD_ID,VALUE);

% get unique IDs
id_unique = unique(ID);

n = numel(id_unique);
f1 = cell(1,n);
f2 = f1;
dates = f1;

for i = 1:n
    %filter DATE by unique ID, then get the latest date
    filter_id = ismember(ID, id_unique{i});
    date_filtered = DATE(filter_id);
    dates{i} = max(date_filtered);
    
    %filter FIELD_ID and VALUE by unique ID and dates
    filter_date = DATE == dates{i};
    filter = filter_id & filter_date;
    field_id_filtered = FIELD_ID(filter);
    value_filtered = VALUE(filter);
    
    %find F1 and F2
    if strcmp(field_id_filtered{1}, 'f1')
        f1{i} = value_filtered(1);
        f2{i} = value_filtered(2);
    else
        f1{i} = value_filtered(2);
        f2{i} = value_filtered(1);
    end
end

%make the final table
T = table(id_unique(:),dates(:),f1(:),f2(:), 'VariableNames',{'ID','DATE','f1','f2'});
  • Related