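How can I improve a regex so it matches the email format in a Google Apps Script email scraper?
I'm trying to build an email scraper that reads your email and puts the transactions into a Google Sheet to make budgeting easier.
The emails are formatted like this: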
This is an Alert to help you manage your credit card account ending in 0000.
As you requested, we are notifying you of any charges over the amount of ($USD) 0.01, as specified in your Alert settings. A charge of ($USD) 44.44 at UBER * EATS PENDIN has been authorized on Apr 34, 2073 at 2:27 PM ET.
Do not reply to this Alert.
If you have questions, please call the number on the back of your credit card, or send a secure message from your Inbox on www.bank.com.
To see all of the Alerts available to you, or to manage your Alert settings, please log on to www.bank.com.
I'm trying to capture only the price (44.44), the company (Uber Eats), the date (Apr 34, 2073), and the time (2:27 PM ET).
This is the regex I have:
/A charge of\s\W+\w+\W+\s(.+?(?=at))\w+\s(.+?(?=has))\w+\s\w+\s\w+\s\w+\s(.+?(?=at))\w+\s(.+?(?=ET))/g
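However, it no longer works, even though it matches on regex101.
Any ideas on how to get it to match in Google Apps Script so I can scrape the emails? Everything else works fine.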
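Answer
With the samples you've shown, could you try the following? It is written for the PCRE flavor and creates 3 capturing groups, from which you can pull the values you need.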
^(?:As you requested.*\$USD\)\s+)(\d+\.\d+)\s+[\w]+\s+([^ ]*).*?authorized on(.*)\.$
Explanation: a detailed breakdown of the regex above.
^(?:                          ## Match from the start of the line and open a non-capturing group.
As you requested.*\$USD\)\s+  ## Match "As you requested" up to the last "$USD)" and the whitespace after it.
)                             ## Close the non-capturing group.
(\d+\.\d+)                    ## 1st capturing group: digits, a dot, digits (the amount).
\s+[\w]+\s+                   ## Match whitespace, word characters, whitespace (the word "at").
([^ ]*)                       ## 2nd capturing group: everything up to the next space (here "UBER").
.*?authorized on              ## Lazily match everything up to "authorized on".
(.*)\.$                       ## 3rd capturing group: the rest of the line up to its final dot (date and time).
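Since the question is about Google Apps Script, here is a minimal JavaScript sketch of how the pattern above could be applied to a multi-line email body. It assumes the `m` flag is added so that `^` and `$` match at line boundaries; the names `body`, `lineRegex`, `found`, and `parsed` are made up for illustration.

```javascript
// Sample body copied from the question (first three lines only).
const body = [
  "This is an Alert to help you manage your credit card account ending in 0000.",
  "As you requested, we are notifying you of any charges over the amount of ($USD) 0.01, as specified in your Alert settings. A charge of ($USD) 44.44 at UBER * EATS PENDIN has been authorized on Apr 34, 2073 at 2:27 PM ET.",
  "Do not reply to this Alert.",
].join("\n");

// Assumption: the `m` flag is needed so ^ and $ match at each line of the body.
const lineRegex = /^(?:As you requested.*\$USD\)\s+)(\d+\.\d+)\s+[\w]+\s+([^ ]*).*?authorized on(.*)\.$/m;

const found = body.match(lineRegex);

// Group 3 starts with the space after "authorized on", so trim it.
const parsed = found
  ? { amount: found[1], company: found[2], when: found[3].trim() }
  : null;

console.log(parsed);
```

Note that the second group stops at the first space, so it yields only `UBER` rather than `UBER * EATS PENDIN`, and the date and time arrive together in the third group.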
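Answer
Your regex looks fine to me. The only problem I see is that you are using the global flag: with `g`, `match()` returns only the full matches, so you don't get the capturing groups. If you remove it, it will work. See the MDN documentation for String.prototype.match().
You can also try named groups, like this: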
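As a rough illustration of that difference, the sketch below runs the question's pattern against the sample sentence with and without the `g` flag. The variable names are made up for this example.

```javascript
const sample =
  "A charge of ($USD) 44.44 at UBER * EATS PENDIN has been authorized on Apr 34, 2073 at 2:27 PM ET.";

// The pattern from the question, without the g flag.
const base = /A charge of\s\W+\w+\W+\s(.+?(?=at))\w+\s(.+?(?=has))\w+\s\w+\s\w+\s\w+\s(.+?(?=at))\w+\s(.+?(?=ET))/;
// Same pattern with the g flag, as in the question.
const withGlobal = new RegExp(base.source, "g");

// With g: an array holding only the full match strings, no groups.
const globalResult = sample.match(withGlobal);

// Without g: [fullMatch, group1, group2, group3, group4].
const plainResult = sample.match(base);

// Each group ends with the space before the lookahead word, so trim it.
const fields = plainResult.slice(1).map((s) => s.trim());

console.log(globalResult.length, plainResult.length, fields);
```

Here the fourth group stops before `ET`, so the time comes back as `2:27 PM`.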
const string =`A charge of ($USD) 44.44 at UBER * EATS PENDIN has been authorized on Apr 34, 2073 at 2:27 PM ET.`;
const regEx = /^A charge of\s\((?<currency>.+)\)\s(?<amount>\d+\.?\d+) at (?<company>.+) has been authorized on (?<date>.+) at (?<time>.+)\.$/;
console.log(string.match(regEx).groups)
Please check browser support for named capturing groups before using them (see Can I use).
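If named groups turn out not to be available in your runtime, a variant with numbered groups is one possible fallback. The sketch below is only an illustration: `parseChargeAlert` is a hypothetical helper name, and the pattern is a slightly loosened version of the one above (no `^`/`$` anchors, lazy quantifiers) so it can be run against a whole email body.

```javascript
// Hypothetical helper: returns [amount, company, date, time], or null if
// the text contains no charge sentence in the expected format.
function parseChargeAlert(text) {
  // Numbered groups: 1 = currency, 2 = amount, 3 = company, 4 = date, 5 = time.
  const re = /A charge of\s\((.+?)\)\s(\d+(?:\.\d+)?) at (.+?) has been authorized on (.+?) at (.+?)\./;
  const found = text.match(re);
  if (!found) return null;

  // Skip the full match and the currency; keep the four requested fields.
  const [, , amount, company, date, time] = found;
  return [Number(amount), company, date, time];
}

const row = parseChargeAlert(
  "As you requested, we are notifying you of any charges over the amount of ($USD) 0.01, as specified in your Alert settings. A charge of ($USD) 44.44 at UBER * EATS PENDIN has been authorized on Apr 34, 2073 at 2:27 PM ET."
);
console.log(row);
```

In Apps Script, an array like `row` could then be handed to something such as `appendRow` on the target sheet, assuming that is how you write to the spreadsheet.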