How does shell select content within the keyword range?

html5 • November 22, 2022, 9:26 am • Q&A

This is an HTML file. It contains a large number of <section>...</section> blocks in the following format.

<!DOCTYPE html>
<html>
<head>
<title>Page Title</title>
</head>
<body>

<section>
<div>
<header><h2>This is a title (RfQVthHm)</h2></header>
More HTML codes...
</div>
</section>

<section>
<div>
<header><h2>This is a title (UaHaZWvm)</h2></header>
More HTML codes...
</div>
</section>

<section>
<div>
<header><h2>This is a title (vxzbXEGq)</h2></header>
More HTML codes...
</div>
</section>

</body>
</html>

I need to extract the second <section>...</section> block.

This is the expected output.

<section>
<div>
<header><h2>This is a title (UaHaZWvm)</h2></header>
More HTML codes...
</div>
</section>

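I noticed that I could first look for the string UaHaZWvm (together with the 2 lines before it) and then read on until the next </section>.

OP's attempt (mentioned in the comments): grep -o "hi.*bye" file

Can this be done with awk, sed, or grep?

Answer

Since you are working with HTML, it is simpler and better to use a tool that understands the format, such as xmllint (or some other program that lets you use XPath expressions to extract parts of a document):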

$ xmllint --html --xpath '//section[2]' input.html 2>/dev/null
<section>
<div>
<header><h2>This is a title (UaHaZWvm)</h2></header>
More HTML codes...
</div>
</section>

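(xmllint reports a lot of errors about the tags; I don't think it really supports HTML5? In any case, that is why standard error is redirected in the command above.)

An alternative is hxselect, from the W3C's HTML-XML-utils collection of programs. Instead of XPath, it uses CSS selectors to specify what to get from the document: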

hxselect 'section:nth-child(2)' < input.html
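
If neither tool is installed, the line-based idea from the question can be approximated with awk. The following is only a rough sketch, not a robust HTML parser: it assumes every <section> and </section> tag sits on its own line and that sections are never nested, as in the sample file. The variable names key, buf, inside and found are made up for illustration, and the sample file is trimmed to two sections.

```shell
# Hypothetical, trimmed copy of the sample file from the question.
cat > input.html <<'EOF'
<body>
<section>
<div>
<header><h2>This is a title (RfQVthHm)</h2></header>
More HTML codes...
</div>
</section>

<section>
<div>
<header><h2>This is a title (UaHaZWvm)</h2></header>
More HTML codes...
</div>
</section>
</body>
EOF

# Rule 1: an opening tag starts a fresh buffer.
# Rule 2: while inside a block, collect lines and note whether the key was seen.
# Rule 3: at the closing tag, print the buffer only if the key was seen.
result=$(awk -v key='UaHaZWvm' '
  /<section>/   { buf = ""; inside = 1; found = 0 }
  inside        { buf = buf $0 "\n"; if (index($0, key)) found = 1 }
  /<\/section>/ { if (inside && found) printf "%s", buf; inside = 0 }
' input.html)

printf '%s\n' "$result"
```

Buffering the whole block before deciding whether to print it avoids having to reach back "2 lines" from the keyword, so the sketch does not depend on how many lines precede the title. It will still break on minified or nested markup, which is why the format-aware tools above are the safer choice.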
