Today we're using .text(), but consider the case where the HTML is:
<div>
By</div><h2 class="authorh2">John Smith</h2></div>
</div>
Visually on the page, the closing </div> after the word "By" ensures there is a space or a line break.
But when applying cheerio's .text(), the result is
ByJohn Smith
which is wrong.
Generally speaking, is it possible to get the text in a slightly different way, so that ANY HTML tag is replaced by a whitespace character? (We're OK with collapsing the resulting runs of whitespace afterwards.)
We'd like to have as output: By John Smith
Thanks.
This would break legitimate content, such as <a href="">My great link</a>!. Unfortunately, this is not something we can fix on our side.
I know that it would not be consistent with .text(), but isn't there another way, within the richness of cheerio's methods, to achieve this?
The question might need to be rephrased, but the point stands. I'm currently facing the same problem trying to get all tags: there are no spaces or breaks in the output, and there is no way to add them without writing specific code per tag, since the text itself is in an array and the position changes based on the tags' contents.
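One possible workaround, sketched below, is to do the replacement on the HTML string rather than relying on .text(). This is only a minimal sketch: textWithSpaces is a hypothetical helper name, not a cheerio method, and the regex approach assumes no ">" characters inside attribute values, does not decode entities such as &amp;, and would also keep the contents of <script> or <style> elements. It also shows the caveat raised above: inline tags get a space too.

```javascript
// Hypothetical helper (not part of cheerio): replace every tag with a space,
// then collapse runs of whitespace and trim the ends.
function textWithSpaces(html) {
  return html
    .replace(/<[^>]*>/g, ' ') // each opening or closing tag becomes one space
    .replace(/\s+/g, ' ')     // collapse consecutive whitespace into one space
    .trim();
}

// The case from the original question.
const html = '<div>By</div><h2 class="authorh2">John Smith</h2>';
console.log(textWithSpaces(html)); // "By John Smith"

// The caveat: an inline tag also produces a space before the "!".
console.log(textWithSpaces('<a href="">My great link</a>!')); // "My great link !"
```

With cheerio, the input string could come from .html(), for example textWithSpaces($('div').html()), keeping in mind that .html() returns null when nothing matches the selector.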