We're implementing accessibility on our site. As the content on the site is authorable, one of the client's requirements is to be informed if heading tags are skipped on a page, for example if an h3 tag is used after an h1 instead of an h2. Please let me know if this can be done, preferably in JavaScript or Java.
CodePudding user response:
Write a basic web crawler in Java (see https://mkyong.com/java/jsoup-basic-web-crawler-example/) and use Jsoup to do the parsing:
// Group of all h-Tags
Elements hTags = doc.select("h1, h2, h3, h4, h5, h6");
// Group of all h1-Tags
Elements h1Tags = hTags.select("h1");
// Group of all h2-Tags
Elements h2Tags = hTags.select("h2");
// ... etc.
Also check whether the page has multiple h1 tags, since that is also problematic for SEO.
CodePudding user response:
You can use document.querySelectorAll("h1, h2, h3, h4, h5, h6")
and then analyze the resulting node list.
I am not sure what your desired output should look like. This code snippet logs an array with the numbers of all heading levels that are skipped. Note that it only expects one h1, h2, ... respectively.
const headings = document.querySelectorAll("h1, h2, h3, h4, h5, h6")
const getHeadingNumber = (heading) => parseInt(heading.tagName[1])
let lastHeadingNumber;
const skippedHeadings = []
for (let heading of headings) {
if(!lastHeadingNumber) {
lastHeadingNumber = getHeadingNumber(heading)
continue;
}
const currentHeadingNumber = getHeadingNumber(heading)
const expectedHeadingNumber = lastHeadingNumber + 1
if(expectedHeadingNumber !== currentHeadingNumber) {
skippedHeadings.push(expectedHeadingNumber)
}
lastHeadingNumber = currentHeadingNumber;
}
console.log(skippedHeadings)
<h1></h1>
<h3></h3>
<h5></h5>
CodePudding user response:
To check for skipped heading levels you need to iterate through the headings on a page and check that they either:
- stay the same (same level)
- decrease (back up one or more levels)
- increase (by only one).
If none of the above are true then you have an error.
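The three rules above can be sketched as a small pure function over heading levels, independent of the DOM. This is only a minimal sketch under assumptions: the name findSkippedLevels and the plain array input are made up for illustration, and the first heading is treated as if the previous level were 0.

```javascript
// Hypothetical helper: takes heading levels in document order, e.g. [1, 2, 4].
// Returns one entry per heading that jumps more than one level deeper.
function findSkippedLevels(levels) {
  const skipped = [];
  let previous = 0; // 0 means "no heading seen yet", so the first heading should be 1
  levels.forEach((level, index) => {
    if (level > previous + 1) {
      // Increase by more than one: a level was skipped.
      skipped.push({ index, expected: previous + 1, seen: level });
      return; // keep comparing against the last valid level
    }
    // Same level, back up, or exactly one deeper: all allowed.
    previous = level;
  });
  return skipped;
}

// One entry is reported: index 2, expected 3, seen 4.
console.log(findSkippedLevels([1, 2, 4, 3, 2, 3]));
```

In a browser, the input array could be built with Array.from(document.querySelectorAll("h1, h2, h3, h4, h5, h6"), h => Number(h.tagName[1])).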
The snippet under Code Example below checks for all of these and returns an array of items for you to debug.
The fields returned are:
- previousHeadingLevel: the previous correct heading level so you can check nesting (useful to see if the error is actually that this heading level is incorrect and should be one level lower!),
- seenHeadingLevel: the heading level we saw,
- expectedHeadingLevel: the heading level we were expecting to see,
- seenHeadingText: the text of the heading that has skipped a heading level so you can find it,
- previousHeadingText: the text of the previous heading so you can find it easily on the page.
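To make the returned objects easier to scan, each one could be turned into a one-line message. This is a sketch only: describeHeadingError is a hypothetical helper, the sample values are made up for illustration, and it assumes an object with exactly the five fields listed above.

```javascript
// Hypothetical helper: formats one error object (fields as listed above) as text.
function describeHeadingError(err) {
  // previousHeadingLevel is a number, or a string when no heading came before.
  const previous = typeof err.previousHeadingLevel === "number"
    ? `H${err.previousHeadingLevel} "${err.previousHeadingText}"`
    : "the start of the document";
  return `H${err.expectedHeadingLevel} skipped: saw H${err.seenHeadingLevel} ` +
    `"${err.seenHeadingText}" after ${previous}`;
}

// Illustrative sample object, not real output from any page.
const sample = {
  previousHeadingLevel: 2,
  seenHeadingLevel: 4,
  expectedHeadingLevel: 3,
  seenHeadingText: "h4-1 - skipped",
  previousHeadingText: "h2-1",
};

console.log(describeHeadingError(sample));
// H3 skipped: saw H4 "h4-1 - skipped" after H2 "h2-1"
```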
Usage
To use the following snippet you can call it with checkheadings()
to check the entire document.
Alternatively you can call it while passing in a valid CSS selector string
such as '.myClass'
to only check within a particular element (useful for complex documents).
Please note: As it is technically valid to have more than one <h1>
in a document this does not check for that, however it is worth noting that it is considered a good practice to only have one <h1>
per page.
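If you do want to flag extra <h1> elements as well, a separate check could look like the following. This is a minimal sketch, not part of the checker in this answer: findExtraH1s is a hypothetical name, and root is assumed to be document or any element that exposes querySelectorAll.

```javascript
// Hypothetical helper: returns the text of every <h1> after the first one.
// `root` is assumed to be `document` or an element (anything with querySelectorAll).
function findExtraH1s(root) {
  const h1s = Array.from(root.querySelectorAll("h1"));
  // Skip the first <h1>; anything left over is an extra one.
  return h1s.slice(1).map((h1) => h1.textContent.trim());
}
```

In the browser this could be called as findExtraH1s(document); an empty array means the page has at most one <h1>.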
Code Example
function getHeadingLevel(heading) {
return parseInt(heading.tagName[1]);
}
function createError(expectedHeadingLevel, previousHeadingLevel, previousHeadingText, currentHeadingLevel, currentHeadingText){
var err = {};
err.previousHeadingLevel = (previousHeadingLevel == 0) ? "no previous headings" : previousHeadingLevel;
err.seenHeadingLevel = currentHeadingLevel;
err.expectedHeadingLevel = expectedHeadingLevel;
err.seenHeadingText = currentHeadingText;
err.previousHeadingText = previousHeadingText;
return err;
}
function checkheadings(container) {
container = container || 'body';
var cont = document.querySelector(container);
var headings = cont.querySelectorAll("h1, h2, h3, h4, h5, h6");
var errors = [];
var previousHeadingLevel = 0;
var previousHeadingText = "Document Start - no headings yet";
if (headings.length < 1) {
errors.push("no headings detected");
return errors;
}
for (var x = 0; x < headings.length; x++) {
var currentHeadingLevel = getHeadingLevel(headings[x]);
var currentHeadingText = headings[x].textContent;
var expectedHeadingLevel = previousHeadingLevel + 1;
if (currentHeadingLevel > expectedHeadingLevel) {
//errors.push("H" + expectedHeadingLevel + " skipped. Previous correct heading was H" + previousHeadingLevel + " containing text:" + previousHeadingText + ". Unexpected H" + currentHeadingLevel + " text: " + currentHeadingText);
errors.push(createError(expectedHeadingLevel, previousHeadingLevel, previousHeadingText, currentHeadingLevel, currentHeadingText));
continue;
}
previousHeadingLevel = currentHeadingLevel;
previousHeadingText = currentHeadingText;
}
return errors;
}
console.log("1 error", checkheadings('.test1'));
console.log("3 errors", checkheadings('.test2'));
console.log("no errors", checkheadings('.test3'));
console.log("entire document", checkheadings());
<div class="test1">
<h1>h1-1</h1>
<h2>h2-1</h2>
<h4>h4-1 - skipped</h4>
<h3>h3-1</h3>
<h2>h2-2</h2>
<h3>h3-2</h3>
<h2>h2-2</h2>
<h2>h2-3</h2>
<h3>h3-3</h3>
</div>
<div class="test2">
<h2>h2 - skipped</h2>
<h1>h1-1</h1>
<h4>h4-1 - skipped</h4>
<h3>h3-1 - skipped</h3>
</div>
<div class="test3">
<h1>h1-1</h1>
<h2>h2-1</h2>
<h3>h3-1</h3>
<h2>h2-2</h2>
<h3>h3-2</h3>
<h2>h2-3</h2>
<h3>h3-3</h3>
<h4>h4-1</h4>
<h5>h5-1</h5>
</div>