for (int i = 0; i < blockNodeSize; i++) {
    String blockTitle = subBlock.getElementsByClass("b-results__drugs-title").get(i).text();
    String blockData = String.valueOf(subBlock.getElementsByTag("dd").get(i))
            //.replace("\n", "")
            .replace("<dd>", "")
            .replace("</dd>", "")
            .replace("<p><i>", "")
            .replace("</i></p>", ":")
            .replace("<p>", "")
            .replace("</p>", "")
            .replace("</i>", "")
            .replace("<br>", "")
            .replace("</br>", "\n");
Oops))
Makes sense. It is needed, after all
And besides, \b is a backspace :D
http://stackoverflow.com/questions/590747/using-regular-expressions-to-parse-html-why-not
Regular expressions can only match regular languages but HTML is a context-free language. The only thing you can do with regexps on HTML is heuristics but that will not work on every condition. It should be possible to present a HTML file that will be matched wrongly by any regular expression.
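To make the "heuristics" point from the quote concrete, here is a minimal sketch of a regex heuristic that works on flat markup and gives a wrong answer as soon as tags nest. The class name NestedTagDemo, the method naiveInner and the sample strings are made up for illustration; none of them come from the snippet above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NestedTagDemo {
    // Hypothetical heuristic: take everything between the first <div>
    // and the nearest closing </div>.
    private static final Pattern DIV =
            Pattern.compile("<div>(.*?)</div>", Pattern.DOTALL);

    // Returns the captured "inner HTML", or null if nothing matched.
    static String naiveInner(String html) {
        Matcher m = DIV.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Flat markup: the heuristic returns the real content.
        System.out.println(naiveInner("<div>flat</div>"));
        // Nested markup: the match stops at the inner </div>,
        // so the result is cut off in the middle of the outer element.
        System.out.println(naiveInner("<div>a<div>b</div>c</div>"));
    }
}
```

Switching to a greedy `.*` only moves the problem: it then swallows sibling elements instead. An actual HTML parser keeps track of the nesting for you.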
> cussed him out from top to bottom
And rightly so.
Tell me about a deficient Turing machine, you deficient member of society
-1 This answer draws the right conclusion ("It's a bad idea to parse HTML with Regex") from wrong arguments ("Because HTML isn't a regular language"). The thing that most people nowadays mean when they say "regex" (PCRE) is well capable not only of parsing context-free grammars (that's trivial actually), but also of context-sensitive grammars (see stackoverflow.com/questions/7434272/…)
Regexes are more powerful than finite automata.
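A small sketch of that claim in Java, under the assumption that backreferences are a fair stand-in: the language "n letters a, then b, then the same n letters a" is not regular, so no finite automaton accepts it, yet a pattern with a backreference does. The class name BackrefDemo and the method matches are invented for the example. This is not the PCRE recursion from the quoted comment; as far as I know java.util.regex has no recursive patterns.

```java
import java.util.regex.Pattern;

public class BackrefDemo {
    // (a*) captures a run of a's, then a literal b,
    // then \1 demands exactly the same run again.
    private static final Pattern SAME_COUNT = Pattern.compile("(a*)b\\1");

    // True only if the whole string has the form a^n b a^n.
    static boolean matches(String s) {
        return SAME_COUNT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("aaabaaa")); // equal runs on both sides
        System.out.println(matches("aabaaa"));  // runs differ in length
    }
}
```

None of this makes regex a good tool for HTML; it only shows that "regex" in practice is not the textbook regular expression.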
I see, got it