sed: replace the second occurrence of multiple characters

I have .srt files in the following format:

0
1
00:00:01,830 --> 00:00:04,740
corresponding text
1

2
00:00:05,280 --> 00:00:10,280
corresponding text
2

3
00:00:10,740 --> 00:00:14,640
corresponding text
3

4
00:00:15,510 --> 00:00:19,260
corresponding text
4

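The extra line repeating the index number runs through the whole subtitle file (line 5, line 6 ... line 540). I tried the command sed '/^[0-9]/ s/.//', which replaced all the numbers as expected, but I don't know how to make it replace only the second occurrence of each number within the range.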

The expected result is:

0
1
00:00:01,830 --> 00:00:04,740
corresponding text

2
00:00:05,280 --> 00:00:10,280
corresponding text

3
00:00:10,740 --> 00:00:14,640
corresponding text

4
00:00:15,510 --> 00:00:19,260
corresponding text

How can I achieve this with sed, awk, or any tool that can do it in batch, since several files have the same problem?

Thanks!

Answer

$ awk 'BEGIN{FS=OFS=RS;RS=""} {$NF=""}1' file
0
1
00:00:01,830 --> 00:00:04,740
corresponding text

2
00:00:05,280 --> 00:00:10,280
corresponding text

3
00:00:10,740 --> 00:00:14,640
corresponding text

4
00:00:15,510 --> 00:00:19,260
corresponding text
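
Since the question mentions several files with the same problem, the command above could be wrapped in a loop. The following is only a sketch: the function names strip_trailing_index and fix_srt_dir are hypothetical, and it assumes every block in every file really does end with the repeated index line.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical, not part of the answer.

# Run the answer's awk command on one file and write the result to stdout.
strip_trailing_index() {
    awk 'BEGIN{FS=OFS=RS;RS=""} {$NF=""}1' "$1"
}

# Apply it to every .srt file in the given directory, replacing each file
# only if awk succeeded (write to a temporary file first, then rename).
fix_srt_dir() {
    dir=$1
    for f in "$dir"/*.srt; do
        [ -f "$f" ] || continue   # skip the unexpanded pattern when nothing matches
        strip_trailing_index "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
}
```

Calling fix_srt_dir . would process the current directory. The command is not idempotent: run on a file that is already clean, it would blank the last text line of each block, so it is safer to try it on copies first.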

  • @RavinderSingh13 - yes, that makes sense. I had snapped to most of that but did not catch the effect of `$NF=""` -- thinking about it in terms of paragraph and nuking the last field makes perfect sense.
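
To make the "paragraph and last field" reading concrete, here is the same one-liner spread out with comments. It is meant as an annotated sketch of the command in the answer; the wrapper name strip_last_line_per_block is hypothetical.

```shell
#!/bin/sh
# Annotated form of the one-liner; the function name is hypothetical.
strip_last_line_per_block() {
    awk '
    BEGIN {
        FS = OFS = RS   # RS is still a newline here, so every line becomes a field
        RS = ""         # empty RS selects paragraph mode: blank lines separate records
    }
    {
        $NF = ""        # blank the last field, i.e. the last line of the block;
                        # assigning to a field rebuilds the record using OFS
    }
    1                   # always-true pattern: print the rebuilt record
    ' "$@"
}
```

Because the emptied last field still contributes a trailing newline, and print then appends its own, each block is followed by exactly one blank line, which keeps the separator between subtitle entries.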
