Combine category with code name [DS code format]

Time:01-14

Some DS code systems don't readily support categories. Is this expression the most efficient way to programmatically combine the category with the code name?

perl -ne '$data = $_ ; $cat = $1 if $data =~ /CAT (.*)/ ; $cde = $1 if $data =~ /CODE \d (.*)/ ; print "$cat, $cde\n" if /CODE \d /' 'Mario Kart DS (USA).mch'

Example 1 - melonDS, Mario Kart DS (USA).mch

CAT Mission 1 Codes

CODE 0 3 Star Rank - Mission 1-1
223D00C4 0000000F

CODE 0 3 Star Rank - Mission 1-2
223D00C5 0000000F

CAT Mission 2 Codes

CODE 0 3 Star Rank - Mission 2-1
223D00CD 0000000F

CAT Mission 3 Codes

CODE 0 3 Star Rank - Mission 3-1
223D00D6 0000000F

Output:

Mission 1 Codes, 3 Star Rank - Mission 1-1
Mission 1 Codes, 3 Star Rank - Mission 1-2
Mission 2 Codes, 3 Star Rank - Mission 2-1
Mission 3 Codes, 3 Star Rank - Mission 3-1

A single regex substitution can't capture the CAT once and prepend it to every CODE that follows. This was the best expression I could come up with:

perl -0777 -pe 's/CAT (.*)(?s).+?(?-s)(?:CODE \d (.*)(?s).+?(?-s))+(?=CAT|CODE|\z)/\1, \2\n/gi' 'Mario Kart DS (USA).mch'

In order to search and replace, I have to capture each group of CODE preceded by CAT. perl -0777 slurps the input file, and switching (?s) on and (?-s) off lets . step across the end of line, so I can anchor CODE matches to the initial CAT match. I can repeat the CODE match, as capture group 2, but it will only ever keep the last one.

The expression above reads like so: For a line starting with 'CAT ' capture to end of line, step across lines in the least greedy way until we reach CODE. For every group that starts with 'CODE [number] ' capture to the end of line, then step across lines until reaching either CAT, CODE, or the end of file. Repeat the code group as many times as possible.

With the example above, this is the output; note that Mission 1-1 is missing:

Mission 1 Codes, 3 Star Rank - Mission 1-2
Mission 2 Codes, 3 Star Rank - Mission 2-1
Mission 3 Codes, 3 Star Rank - Mission 3-1
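
The "only ever get the last one" behavior can be reproduced in isolation. The following is a minimal sketch, assuming a shortened two-code input (the names A and B are placeholders, not real cheats): a quantified group overwrites its capture on each iteration, while a separate /g match over the CODE lines keeps every name.

```perl
use strict;
use warnings;

# Shortened, hypothetical input: one category followed by two codes.
my $text = "CAT Mission 1 Codes\nCODE 0 A\nCODE 0 B\n";

# A quantified group overwrites its capture on every iteration,
# so the second capture holds only the last CODE name.
my ($cat, $last) = $text =~ /^CAT (.*)\n(?:CODE \d (.*)\n)+/;

# Matching CODE lines one at a time with /g keeps every name.
my @all = $text =~ /^CODE \d (.*)$/mg;

print "last only: $cat, $last\n";
print "each one:  $cat, $_\n" for @all;
```

This is why the slurp-and-substitute approach loses Mission 1-1: the category has to be remembered outside the regex and reused for each code.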

CodePudding user response:

Debating what is most efficient or not is perhaps not too interesting in this case. If you have a solution that works, that should perhaps suffice.

Here is another solution, based on paragraph mode.

  • -00: sets the input record separator to the empty string ($/ = ''), which enables paragraph mode. Records are then separated by one or more blank lines instead of single newlines.
  • -l: chomps each record automatically
  • -E: enables say (used here instead of print because -l changes what print appends)

Then just store the header if /^CAT/, else clean up and print.

$ perl -00 -nlwE'if (s/^CAT //) { $k = $_ } else { s/^CODE \d+ //; s/\n.*//; say "$k, $_"; }' mission.txt
Mission 1 Codes, 3 Star Rank - Mission 1-1
Mission 1 Codes, 3 Star Rank - Mission 1-2
Mission 2 Codes, 3 Star Rank - Mission 2-1
Mission 3 Codes, 3 Star Rank - Mission 3-1

As a file:

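use strict;
use warnings;
use feature 'say';

$/ = '';    # paragraph mode: records are separated by blank lines

my $key;
while (<DATA>) {
    chomp;
    if (s/^CAT //) {
        $key = $_;            # remember the current category
    } else {
        s/^CODE \d+ //;       # strip the "CODE n " prefix
        s/\n.*//s;            # drop the code line(s) after the name
        say "$key, $_";
    }
}

__DATA__
CAT Mission 1 Codes

CODE 0 3 Star Rank - Mission 1-1
223D00C4 0000000F

CODE 0 3 Star Rank - Mission 1-2
223D00C5 0000000F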

CodePudding user response:

To elaborate on the initial question, it's important to note that I know some regex and no Perl, so I don't know what an efficient Perl expression looks like. From my experience, regular expressions are great at capturing 'one this or one that' but we need 'one this and many that'.

If I were talking about the title of a book chapter and each subsequent paragraph, the goal would be to merge the title as the first sentence of each paragraph for each chapter.

A regular expression could capture the title and indent of each paragraph but must limit itself to one chapter at a time. The title becomes capture group 1 while the paragraphs are capture group 2. We can't have 'one and many'; 'one or the other' would return all chapters and paragraphs (as capture group 1 or 2) but wouldn't allow them to be merged together.

Perl allows this simply by storing the title in a variable and adding it as part of the substitution for each paragraph. Since the title occurs first, and only once, per chapter, it can easily be merged in a 'one this, many that' situation.
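
As a rough illustration of that 'one this, many that' idea applied to the code format, here is a minimal sketch. The helper name merge_categories is hypothetical, and it assumes each CAT or CODE entry arrives as its own line.

```perl
use strict;
use warnings;

# Hypothetical helper: takes melonDS-style lines and returns
# "category, code name" strings.
sub merge_categories {
    my @lines = @_;
    my $cat = '';
    my @out;
    for my $line (@lines) {
        if ($line =~ /^CAT (.*)/) {
            $cat = $1;                 # "one this": remember the category
        }
        elsif ($line =~ /^CODE \d+ (.*)/) {
            push @out, "$cat, $1";     # "many that": reuse it for every code
        }
    }
    return @out;
}

print "$_\n" for merge_categories(
    'CAT Mission 1 Codes',
    'CODE 0 3 Star Rank - Mission 1-1',
    '223D00C4 0000000F',
    'CODE 0 3 Star Rank - Mission 1-2',
    'CAT Mission 2 Codes',
    'CODE 0 3 Star Rank - Mission 2-1',
);
```

The variable outlives each individual match, which is the part a single substitution cannot express.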

The initial example was flawed in that it was extracting information when it should have removed the categories and merged them with the code names. With that goal, an expression like this would suffice:

perl -pe '$cat = $1 if s/(?:^CAT ([^\v]+).*\n)// ; s/(^CODE \d )/$1$cat, /'

Within the non-capture group (?:...), for a line that starts with 'CAT ', capture every character that isn't vertical whitespace (an end-of-line character) with ([^\v]+), then match up to and including the end of line with .*\n (which covers the line endings used on Windows, macOS, and Linux, since each ends in \n, the linefeed), and replace the entire match, including the final linefeed, with nothing (//). This expression stores the category while removing its line.

The next expression (separated by a semicolon) captures the phrase 'CODE # ' with (^CODE \d ) on each line that matches, then writes the phrase back followed by the stored category and a comma ($1$cat, ). This is the result for Example 1:


CODE 0 Mission 1 Codes, 3 Star Rank - Mission 1-1
223D00C4 0000000F

CODE 0 Mission 1 Codes, 3 Star Rank - Mission 1-2
223D00C5 0000000F


CODE 0 Mission 2 Codes, 3 Star Rank - Mission 2-1
223D00CD 0000000F


CODE 0 Mission 3 Codes, 3 Star Rank - Mission 3-1
223D00D6 0000000F

Unfortunately, the melonDS code format insists there be at least one category for the file to be read properly so we'd have to add something generic back in on the first line e.g., CAT Cheats.
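
Adding the generic category back could be folded into the same pass. This is a minimal sketch, not a tested melonDS workflow: the helper name fold_categories is hypothetical, 'Cheats' is just the placeholder name suggested above, and the whole file is assumed to be available as one string.

```perl
use strict;
use warnings;

# Hypothetical helper: removes CAT lines, prefixes each code name with its
# category, and puts one generic CAT line at the top.
sub fold_categories {
    my ($text, $generic) = @_;
    $generic //= 'Cheats';             # placeholder category name
    my $cat = '';
    my $out = "CAT $generic\n";
    for my $line (split /^/, $text) {  # split into lines, keeping newlines
        if ($line =~ /^CAT ([^\v]+)/) {
            $cat = $1;                 # remember the category, drop the line
            next;
        }
        $line =~ s/^(CODE \d+ )/$1$cat, /;
        $out .= $line;
    }
    return $out;
}

print fold_categories(
    "CAT Mission 1 Codes\n\nCODE 0 3 Star Rank - Mission 1-1\n223D00C4 0000000F\n"
);
```

Blank lines left behind by the removed CAT lines are kept as they are, matching the output shown above.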

A better use case would be a RetroArch formatted cheat file since it doesn't directly support categories. The cheat files that ship with the program use a trick to simulate this in the form of a numbered cheat description that lacks a subsequent code.

Example 2: RetroArch, Mario Kart DS (USA).cht

cheats = 514

cheat0_desc = "Misc Codes"

cheat1_desc = "Freeze Time"
cheat1_code = "621755FC 00000000 B21755FC 00000000 10000000 00000000 D2000000 00000000"
cheat1_enable = false

cheat2_desc = "Start for Final Lap"
cheat2_code = "94000130 FFF70000 023CDD3F 00000001 D2000000 00000000"
cheat2_enable = false

With this expression:

perl -0777 -pe 's|(cheat(\d+)_)desc(?=.*\n(?!cheat\2_code))|\1cat|gi' 'Mario Kart DS (USA).cht' | perl -pe '$cat = $1 if s/(?:^cheat\d+_cat = \"(.*)\".*\n)// ; s/(^cheat\d+_desc = \")/$1$cat, /'

The result is:

cheats = 514


cheat1_desc = "Misc Codes, Freeze Time"
cheat1_code = "621755FC 00000000 B21755FC 00000000 10000000 00000000 D2000000 00000000"
cheat1_enable = false

cheat2_desc = "Misc Codes, Start for Final Lap"
cheat2_code = "94000130 FFF70000 023CDD3F 00000001 D2000000 00000000"
cheat2_enable = false

At a high level, the first command slurps the input file and, for each numbered cheat description (e.g., cheat0_desc) that is not immediately followed by its matching code line (cheat0_code), renames cheat0_desc to cheat0_cat. The result is piped to the second command (basically a repeat of the one shown above), which stores and removes each cheat#_cat line, then replaces 'cheat#_desc = "' with itself followed by the category.
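
The same idea can also be sketched as a single script instead of two piped commands. This is only an illustration under a few assumptions: the helper name fold_cht_categories is hypothetical, the file is passed in as one string, and a description counts as a category when no cheatN_code line with the same number exists anywhere in the file (a looser test than "not immediately followed by"). Unlike the one-liner, it leaves descriptions untouched until a category has been seen.

```perl
use strict;
use warnings;

# Hypothetical helper: folds code-less "category" descriptions into the
# descriptions of the cheats that follow them.
sub fold_cht_categories {
    my ($text) = @_;
    # Cheat numbers that actually carry a code line.
    my %has_code = map { $_ => 1 } $text =~ /^cheat(\d+)_code\b/mg;
    my $cat = '';
    my $out = '';
    for my $line (split /^/, $text) {
        if ($line =~ /^cheat(\d+)_desc = "(.*)"/) {
            if (!$has_code{$1}) {
                $cat = $2;             # description without a code: a category
                next;                  # drop the category line
            }
            $line =~ s/^(cheat\d+_desc = ")/$1$cat, / if length $cat;
        }
        $out .= $line;
    }
    return $out;
}

print fold_cht_categories(
    qq{cheat0_desc = "Misc Codes"\n\ncheat1_desc = "Freeze Time"\ncheat1_enable = false\ncheat1_code = "621755FC 00000000"\n}
);
```

Like the one-liner, this sketch does not renumber the cheats or adjust the cheats = count.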

I feel the question was valuable but poorly asked due to lack of knowledge and the continuing learning process.
