Perl nested hashes matching and merging

Time:10-10

I have a file that is read and split into %objects, which is populated as shown below.

$VAR1 = 'cars';
$VAR2 = {
          'car1' => {
                        'info1' => '"fast"',
                        'info2' => 'boring'
                      },
          'car2' => {
                        'info1' => '"slow"',
                        'info2' => 'boring info'
                      },
          'car3' => {
                        'info1' => '"unique"',
                        'info2' => 'useless info'
                      }
                };
$VAR3 = 'age';
$VAR4 = {
          'new' => {
                                  'info3' => 'rust',
                                  'info4' => '"car1"'
                                },
          'old' => {
                                  'info3' => 'shiny',
                                  'info4' => '"car2" "car3"'
                                }
                };

My goal is to insert data like "car1 fast rust, car2 slow shiny, car3 unique shiny" into a DB, but I can't work out how to match, e.g., "rust" to a car based on info4 in age.

my $key  = 'cars';
my $key2 = 'age';

foreach my $obj (keys %{ $objects{$key} }) {                        # for every car
    my @info1s = $objects{$key}{$obj}{'info1'} =~ m/"(.*?)"/g;      # added to clean up all info1
    foreach my $info (@info1s) {
        # $sth stands for a prepared INSERT statement handle (setup not shown)
        $sth->execute($obj, $info);                                 # this gives me "car1 fast, car2 slow, car3 unique"
    }
...

Can somebody please point me in the right direction to fetch and store info4 with related info1/info2?

Thanks!

CodePudding user response:

I take the objective to be as follows. Take the values of the info4 keys in the deepest-level hashrefs of $VAR4 and look them up as top-level keys in the $VAR2 hashref. Then associate with each of them two values: that of info3, the "sibling" of info4 in the same deepest-level hashref of $VAR4, and that of info1 from $VAR2.

One can traverse the structure by hand for this purpose, especially if it always has the same two levels as shown, but it's easier and better with libraries. I use Data::Leaf::Walker to get leaves (deepest values) and key-paths to them, and Data::Diver to get values for known paths.

use warnings;
use strict;
use feature 'say';
use Data::Dump;    
use Data::Leaf::Walker;
use Data::Diver qw(Dive);

my $hr1 = {
    'car1' => { 'info1' => 'fast',   'info2' => 'boring' },
    'car2' => { 'info1' => 'slow',   'info2' => 'boring info' },
    'car3' => { 'info1' => 'unique', 'info2' => 'useless info' }
};
my $hr2 = {
    'new' => { 'info3' => 'rust',  'info4' => 'car1' },
    'old' => { 'info3' => 'shiny', 'info4' => 'car2 car3' }
};

my $walker = Data::Leaf::Walker->new($hr2);    
my %res;    
while ( my ($path, $value) = $walker->each ) { 
    next if $path->[-1] ne 'info4';

    # Some "values" have multiple needed values separated by space
    for my $val (split ' ', $value) { 
        # Get from 'info4' path the one to its sibling, 'info3'
        my @sibling_path = ( @{$path}[0..$#$path-1], 'info3' );

        # Collect results: values of `info3` and `info1`
        push @{$res{$val}}, 
            Dive( $hr2, @sibling_path   ), 
            Dive( $hr1, ($val, 'info1') );
    }
}
dd \%res;

This assumes a few things and takes some shortcuts, for simplicity.

For one, I use the explicit infoN keys from the question and the two-level structure. If the data is, or can be, different, this shouldn't be hard to adjust.

Next, this assumes that a value like car1 always exists as a key in the other hashref. Add an exists check for that key if it is possible that it doesn't exist as a key.
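Such a check could look roughly like the sketch below. It uses plain loops instead of the modules above to stay short; the names %cars, %age, %merged and @missing are illustrative, and car9 is a made-up entry added only to trigger the missing-key case.

```perl
use strict;
use warnings;

# Reduced sample data; 'car9' is hypothetical and has no entry in %cars
my %cars = ( car1 => { info1 => 'fast' } );
my %age  = ( new  => { info3 => 'rust', info4 => 'car1 car9' } );

my (%merged, @missing);
for my $group (sort keys %age) {
    for my $car (split ' ', $age{$group}{info4}) {
        # Test with exists first, so that the lookup below cannot
        # autovivify an empty $cars{$car} entry for an unknown car
        if (!exists $cars{$car}) {
            push @missing, $car;
            next;
        }
        push @{ $merged{$car} }, $age{$group}{info3}, $cars{$car}{info1};
    }
}
warn "no car entry for: @missing\n" if @missing;
```

Whether to skip, warn, or die on a missing key depends on how strict the import should be; the sketch only collects the names and warns.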

I've removed some extra quotes from the data. (If they are meant for the database entry, add them when constructing the statement. If the data comes in with such extra quotes, it should be easy to adjust the code to take them into account.)
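If the quotes do arrive in the data, one way to take them into account is to extract the quoted parts with a global match, the same regex idea the question already uses for info1. This is a minimal sketch on two values copied from the question's dump; the variable names are illustrative.

```perl
use strict;
use warnings;

# Values as they appear in the question's dump, with embedded double quotes
my $info4 = '"car2" "car3"';
my $info1 = '"slow"';

# A /g match in list context returns every captured group,
# so one string can yield several car names
my @cars    = $info4 =~ /"(.*?)"/g;
# Without /g, list context returns the captures of the first match only
my ($speed) = $info1 =~ /"(.*?)"/;

print "$_ $speed\n" for @cars;
```

With quoted input, this extraction would stand in for the split ' ', $value step in the program above.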

The above program prints

{
  car1 => ["rust", "fast"],
  car2 => ["shiny", "slow"],
  car3 => ["shiny", "unique"],
}

(I use Data::Dump to display complex data structures, for its simplicity and compact default output.)
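For comparison, the by-hand traversal mentioned earlier could be sketched with core Perl only, assuming the fixed two-level layout from the question. The names %cars, %age and %merged are illustrative, and the values are ordered info1 then info3 to match the "car1 fast rust" rows the question asks for.

```perl
use strict;
use warnings;

# Sample data copied from the question, with the extra quotes removed
my %cars = (
    car1 => { info1 => 'fast',   info2 => 'boring' },
    car2 => { info1 => 'slow',   info2 => 'boring info' },
    car3 => { info1 => 'unique', info2 => 'useless info' },
);
my %age = (
    new => { info3 => 'rust',  info4 => 'car1' },
    old => { info3 => 'shiny', info4 => 'car2 car3' },
);

# %merged maps each car to [ info1, info3 ]
my %merged;
for my $group (sort keys %age) {
    # info4 may list several cars separated by whitespace
    for my $car (split ' ', $age{$group}{info4}) {
        $merged{$car} = [ $cars{$car}{info1}, $age{$group}{info3} ];
    }
}

# One "car info1 info3" line per car, e.g. "car1 fast rust"
print join(' ', $_, @{ $merged{$_} }), "\n" for sort keys %merged;
```

Each printed line corresponds to one row to insert; with DBI, the car name and the two values would typically be passed as bind parameters to a prepared statement rather than interpolated into the SQL.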
