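Remove Re:, Fwd: from email subject
I'm trying to set up a regex that removes the extra keywords from an email subject that mail composers usually add, such as Fwd:, Re: and so on, but I can't come up with one that handles all of these cases: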
Fwd : Re : Re: Many
Re : Re: Many
Re: Re: Many
Re: Many
Re: Many
RE: Presidential Ballots for Florida
RE: (no subject)
Request - should not match anything
this is the subject
Re: Fwd
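I tried this regex in Java:
subject.replaceAll("^.{0,3}:\\s", "");
But this only removes the first match found. Any regex that covers the most common scenarios, even if not all of the above, would also be a great help. I found some regexes for Python, but converting them to Java is a pain. Any help is appreciated.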
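Answer
You could use the following pattern to remove occurrences that are not necessarily anchored to the start of the string:
\b(?:Fwd|Re)\b\h*(?::\h*)?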
Note that this will also match the Fwd in the last line, Re: Fwd, so that line ends up empty.
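As a rough illustration of that side effect, here is a minimal sketch that applies the unanchored pattern to a few of the sample subjects. The class name LooseSubjectCleaner and the method stripAnywhere are hypothetical, and the CASE_INSENSITIVE flag is an assumption added so that RE: is handled as well.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original answer.
final class LooseSubjectCleaner {
    // \b(?:Fwd|Re)\b\h*(?::\h*)? with the backslashes doubled for a Java string literal.
    // CASE_INSENSITIVE is an assumption so that "RE:" is removed too.
    private static final Pattern KEYWORD =
            Pattern.compile("\\b(?:Fwd|Re)\\b\\h*(?::\\h*)?", Pattern.CASE_INSENSITIVE);

    // Removes every standalone Fwd/Re keyword, wherever it appears in the subject.
    static String stripAnywhere(String subject) {
        return KEYWORD.matcher(subject).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripAnywhere("Fwd : Re : Re: Many"));                 // prints "Many"
        System.out.println(stripAnywhere("Request - should not match anything")); // unchanged
        System.out.println(stripAnywhere("Re: Fwd").isEmpty());                   // prints "true"
    }
}
```

The word boundaries keep Request intact, but because the colon is optional, a subject that consists only of keywords is wiped out completely.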
If that Fwd should not be matched (so the colon is not optional) and the match should be anchored to the start of the string:
^(?:(?:Fwd|Re)\h*:\h*)+
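Explanation
^ Start of the string
(?: Non-capturing group
  (?:Fwd|Re)\h*:\h* Match either Fwd or Re, followed by a colon with optional horizontal whitespace on either side
)+ Close the non-capturing group and repeat it 1 or more times to consume all occurrences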
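For a single subject held in its own string, the anchored pattern could be wrapped in a small helper along these lines. This is only a sketch: the names SubjectCleaner and stripPrefixes are hypothetical, and MULTILINE is left out on the assumption that each subject is processed separately, so ^ only needs to match at the start of the input.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original answer.
final class SubjectCleaner {
    // ^(?:(?:Fwd|Re)\h*:\h*)+ with the backslashes doubled for a Java string literal.
    // CASE_INSENSITIVE lets "RE:" and "FWD:" match as well.
    private static final Pattern PREFIXES =
            Pattern.compile("^(?:(?:Fwd|Re)\\h*:\\h*)+", Pattern.CASE_INSENSITIVE);

    // Strips the whole leading run of "Re:" / "Fwd:" prefixes from one subject.
    static String stripPrefixes(String subject) {
        return PREFIXES.matcher(subject).replaceFirst("");
    }

    public static void main(String[] args) {
        System.out.println(stripPrefixes("Fwd : Re : Re: Many")); // prints "Many"
        System.out.println(stripPrefixes("RE: (no subject)"));    // prints "(no subject)"
        System.out.println(stripPrefixes("Re: Fwd"));             // prints "Fwd"
    }
}
```

replaceFirst is enough here because the + quantifier consumes the entire run of prefixes in a single match.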
Example
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Backslashes are doubled inside the Java string literal
String regex = "^(?:(?:Fwd|Re)\\h*:\\h*)+";
String string = "Fwd : Re : Re: Many\n"
+ "Re : Re: Many\n"
+ "Re: Re: Many\n"
+ "Re: Many\n"
+ "Re: Many\n"
+ "RE: Presidential Ballots for Florida\n"
+ "RE: (no subject)\n"
+ "Request - should not match anything\n"
+ "this is the subject\n"
+ "Re: Fwd";
// MULTILINE makes ^ match at the start of every line; CASE_INSENSITIVE covers "RE:"
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(string);
String result = matcher.replaceAll("");
System.out.println(result);
Output
Many
Many
Many
Many
Many
Presidential Ballots for Florida
(no subject)
Request - should not match anything
this is the subject
Fwd