I am trying to create a regex that finds text in a markdown file. Basically, I have "tasks" marked with the - [ ] or - [x] characters (undone or done) and project headers (marked with ##). I would like to find all undone tasks and their project names.
For example, for this sample text:
# Top of File
## Project A
Descriptive line
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec finibus elit non nibh lobortis molestie.
- [ ] an undone task
- [x] a completed task
- [x] second completed task
## Project B
Descriptive line
- [x] a completed task
- [ ] an uncompleted task
## Project C
Descriptive line
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec finibus elit non nibh lobortis molestie.
- [x] completed task
- [ ] uncompleted task
- [x] completed task
I would like to return:
Project A, an undone task
Project B, an uncompleted task
Project C, uncompleted task
This comes close, but the number of lines between a header and its tasks varies, and the regex needs to be told a fixed number of lines to look ahead: ((.*(\n|\r|\r\n)){5})\- \[ \]
CodePudding user response:
We can try using match() here to alternately find the project header lines or the incomplete task lines. Then, do a reduction to combine each pair of matching lines with a comma separator.
var input = `# Top of File
## Project A
Descriptive line
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec finibus elit non nibh lobortis molestie.
- [ ] an undone task
- [x] a completed task
- [x] second completed task
## Project B
Descriptive line
- [x] a completed task
- [ ] an uncompleted task
## Project C
Descriptive line
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec finibus elit non nibh lobortis molestie.
- [x] completed task
- [ ] uncompleted task
- [x] completed task`;
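// Keep only the project headers and the undone-task lines, in document order.
var lines = input.match(/## (.*)|- \[ \] (.*)/g)
    .map(x => x.match(/\w+(?: \w+)*/g)[0]); // strip the leading "## " or "- [ ] "
var output = [];
var i = 0;
// Assumes each project has exactly one undone task, so entries alternate project, task.
while (i < lines.length) {
    output.push(lines[i] + ", " + lines[i + 1]);
    i += 2;
}
console.log(output);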
Here is an explanation of the regex pattern used to find the matching lines:
## (.*)         match and capture the project text
|               OR
- \[ \] (.*)    match and capture the incomplete text
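To see what the alternation returns on its own, here is a small sketch run against a shortened input. The `snippet` and `matched` names are only for illustration and are not part of the answer's code.

```javascript
// With the g flag, match() returns the full matched substrings,
// not the contents of the capture groups.
const snippet = [
  "## Project A",
  "Descriptive line",
  "- [ ] an undone task",
  "- [x] a completed task",
].join("\n");

const matched = snippet.match(/## (.*)|- \[ \] (.*)/g);
console.log(matched);
// prints [ '## Project A', '- [ ] an undone task' ]
```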
But the match() function will return the leading portion (e.g. ## ) which we don't want. So I added a map() step which removes this leading content. Finally, we iterate over the array of lines and combine them in order with a comma.
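Note that the loop above pairs entries by position, so it relies on every project having exactly one undone task. If a project might have none or several, one possible alternative is to walk the text line by line and remember the most recent header. This is only a sketch under that assumption; `findUndoneTasks` and `sample` are hypothetical names, not part of the original code.

```javascript
// Sketch: track the latest "## " header and emit one entry per undone task.
function findUndoneTasks(markdown) {
  const results = [];
  let project = null; // most recent project header seen so far

  for (const line of markdown.split(/\r?\n/)) {
    const header = line.match(/^## (.*)$/);
    if (header) {
      project = header[1].trim();
      continue;
    }
    const task = line.match(/^- \[ \] (.*)$/);
    // Undone tasks that appear before any project header are skipped.
    if (task && project !== null) {
      results.push(project + ", " + task[1].trim());
    }
  }
  return results;
}

const sample = [
  "# Top of File",
  "## Project A",
  "- [ ] first undone task",
  "- [ ] second undone task",
  "## Project B",
  "- [x] a completed task",
  "## Project C",
  "- [ ] uncompleted task",
].join("\n");

console.log(findUndoneTasks(sample));
// prints:
// [ 'Project A, first undone task',
//   'Project A, second undone task',
//   'Project C, uncompleted task' ]
```

Project B has no undone tasks here, so it produces no entry, and Project A produces two.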