I am writing a program that calculates the sizes of open-source repositories and libraries and saves the data to a database for further analysis.
- I have an input string:
github.com/Azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1
- It is parsed into this format:
github.com/\!azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1
- Then I convert that into this path:
/home/username/dev/glass/tmp/pkg/mod/github.com/\!azure/go-ansiterm@v0.0.0-20210617225240-d185dfc1b5a1
which is a valid path in my filesystem, where I've downloaded that particular Go library.
- After that, I pass that path to the gocloc program.
The problem is that when I pass that string as an argument to os/exec, which runs gocloc with that path string, the command runs with two escapes, and that is not a valid path. Is there any way to work around this? One idea I have is to create a shell script that does what I want.
This is the function that parses
github.com/Azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1
to the format
github.com/\!azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1
After that, the result is saved into a variable, and the variable has one more escape than it should have.

func parseUrlToVendorDownloadFormat(input string) string {
	// Split the input string on the first space character
	parts := strings.SplitN(input, " ", 2)
	if len(parts) != 2 {
		return ""
	}

	// Split the package name on the '/' character
	packageNameParts := strings.Split(parts[0], "/")

	// Add the '\!' prefix and lowercase each part of the package name
	for i, part := range packageNameParts {
		if hasUppercase(part) {
			packageNameParts[i] = "\\!" + strings.ToLower(part)
		}
	}

	// Join the modified package name parts with '/' characters
	packageName := strings.Join(packageNameParts, "/")

	return strings.ReplaceAll(packageName+"@"+parts[1], `\\!`, `\!`)
}
After the string has been converted to the path
/home/username/dev/glass/tmp/pkg/mod/github.com/\!azure/go-ansiterm@v0.0.0-20210617225240-d185dfc1b5a1
it is passed to this function:
// Alternative goCloc - command.
func linesOfCode(dir string) (int, error) {
	// Run the `gocloc` command in the specified directory and get the output
	cmd := exec.Command("gocloc", dir)
	output, err := cmd.Output()
	if err != nil {
		return 0, err
	}

	lines, err := parseTotalLines(string(output))
	if err != nil {
		return 0, err
	}

	return lines, nil
}
Which uses this parse function:
// Parse from the GoCloc response.
func parseTotalLines(input string) (int, error) {
	// Split the input string into lines
	lines := strings.Split(input, "\n")

	// Find the line containing the "TOTAL" row
	var totalLine string
	for _, line := range lines {
		if strings.Contains(line, "TOTAL") {
			totalLine = line
			break
		}
	}

	// If the "TOTAL" line was not found, return an error
	if totalLine == "" {
		return 0, fmt.Errorf("could not find TOTAL line in input")
	}

	// Split the "TOTAL" line into fields
	fields := strings.Fields(totalLine)

	// If the "TOTAL" line doesn't have enough fields, return an error
	if len(fields) < 4 {
		return 0, fmt.Errorf("invalid TOTAL line: not enough fields")
	}

	// Get the fourth field (the code column)
	codeStr := fields[3]

	// Remove any commas from the code column
	codeStr = strings.Replace(codeStr, ",", "", -1)

	// Parse the code column as an integer
	code, err := strconv.Atoi(codeStr)
	if err != nil {
		return 0, err
	}

	return code, nil
}
What I've tried:
- Using gocloc as a library; I didn't get it to work.
- Using single quotes instead of escapes; I didn't get it to work either, but I think there might be something to it.
One way around this might be to create a separate shell script, pass the dir to it as an argument, and get rid of the escapes there, but I'm not sure.
If you want to look at the full source code: https://github.com/haapjari/glass. More specifically, the relevant file is https://github.com/haapjari/glass/blob/main/pkg/plugins/goplg/plugin.go with the function enrichWithLibraryData(), and the utility functions are here: https://github.com/haapjari/glass/blob/main/pkg/plugins/goplg/utils.go (the examples above).
Any ideas? How should I proceed? Thanks in advance!
CodePudding user response:
I have an input string:
github.com/Azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1
Parsed to a format:
github.com/\!azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1
Your parser seems to have an error. I would expect Azure to become !azure, with no backslash:
github.com/!azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1
To avoid ambiguity when serving from case-insensitive file systems, the $module and $version elements are case-encoded by replacing every uppercase letter with an exclamation mark followed by the corresponding lower-case letter. This allows modules example.com/M and example.com/m to both be stored on disk, since the former is encoded as example.com/!m.