From 33992133285e309b77c8a6f04e3a40ecba139e56 Mon Sep 17 00:00:00 2001
From: Hongbo Wu
Date: Wed, 27 Sep 2023 15:32:42 +0800
Subject: [PATCH] add test cases from economist and caixin

---
 packages/readabilityjs/Readability.js         |    4 +
 .../test-pages/caixin/expected-metadata.json  |   11 +
 .../test/test-pages/caixin/expected.html      |   45 +
 .../test/test-pages/caixin/source.html        | 2268 +++++++++++++++++
 .../test/test-pages/caixin/url.txt            |    1 +
 .../economist/expected-metadata.json          |   12 +
 .../test/test-pages/economist/expected.html   |   39 +
 .../test/test-pages/economist/source.html     | 1385 ++++++++++
 .../test/test-pages/economist/url.txt         |    1 +
 9 files changed, 3766 insertions(+)
 create mode 100644 packages/readabilityjs/test/test-pages/caixin/expected-metadata.json
 create mode 100644 packages/readabilityjs/test/test-pages/caixin/expected.html
 create mode 100644 packages/readabilityjs/test/test-pages/caixin/source.html
 create mode 100644 packages/readabilityjs/test/test-pages/caixin/url.txt
 create mode 100644 packages/readabilityjs/test/test-pages/economist/expected-metadata.json
 create mode 100644 packages/readabilityjs/test/test-pages/economist/expected.html
 create mode 100644 packages/readabilityjs/test/test-pages/economist/source.html
 create mode 100644 packages/readabilityjs/test/test-pages/economist/url.txt

diff --git a/packages/readabilityjs/Readability.js b/packages/readabilityjs/Readability.js
index 353270c1c..08ed19959 100644
--- a/packages/readabilityjs/Readability.js
+++ b/packages/readabilityjs/Readability.js
@@ -1073,6 +1073,10 @@ Readability.prototype = {
   },
 
   _checkPublishedDate: function (node, matchString) {
+    if (this._articlePublishedDate) {
+      return false;
+    }
+
     // Skipping meta tags
     if (node.tagName.toLowerCase() === 'meta') return
     // return published date if the class name is 'omnivore-published-date' which we added when we scraped the article
diff --git a/packages/readabilityjs/test/test-pages/caixin/expected-metadata.json b/packages/readabilityjs/test/test-pages/caixin/expected-metadata.json
new file mode 100644
index 000000000..d18524c51
--- /dev/null
+++ b/packages/readabilityjs/test/test-pages/caixin/expected-metadata.json
@@ -0,0 +1,11 @@
+{
+  "title": "途虎养车港交所挂牌 腾讯为最大外部股东",
+  "byline": "文|财新 余聪",
+  "dir": null,
+  "excerpt": "途虎养车 腾讯国内汽车服务市场高度分散,2022年,途虎养车取得汽车服务收入115亿元,市场份额0.9%",
+  "siteName": "fakehost",
+  "previewImage": "https://img.caixin.com/2023-09-26/169572084568190_560_373.jpg",
+  "publishedDate": "2023-09-26T00:00:00.000Z",
+  "language": "English",
+  "readerable": true
+}
diff --git a/packages/readabilityjs/test/test-pages/caixin/expected.html b/packages/readabilityjs/test/test-pages/caixin/expected.html
new file mode 100644
index 000000000..feb2717e5
--- /dev/null
+++ b/packages/readabilityjs/test/test-pages/caixin/expected.html
@@ -0,0 +1,45 @@
+
+
+
+

途虎养车港交所挂牌 腾讯为最大外部股东 +

+ + +
+ +

文|财新 余聪

+

2023年09月26日 17:22

+ + + +

试听

+
+

国内汽车服务市场高度分散,2022年,途虎养车取得汽车服务收入115亿元,市场份额0.9%

+
+
+

  【财新网】9月26日,汽车服务平台途虎养车正式在港交所主板挂牌上市。途虎养车( 09690.HK )上市发行价为28港元/股,此前公司披露的发行价区间为28港元/股至31港元/股,即实际发行价为区间下限。当日,途虎养车收报29.5港元/股,较发行价涨5.36%,市值为239.6亿港元。

+

  途虎养车上市不易。途虎养车2022年1月即在港交所递表,2022年8月、2023年3月两次重新递交上市申请材料,终于在2023年8月23日通过聆讯。

+
+ + +
+

+

登录 后获取已订阅的阅读权限

+ + + + + + +
+
+

+
+ + +

  推荐进入财新数据库,可随时查阅公司股价走势、结构人员变化等投资信息。

+

责任编辑:屈运栩 | 版面编辑:刘潇(ZN028)

+
+
+
\ No newline at end of file
diff --git a/packages/readabilityjs/test/test-pages/caixin/source.html b/packages/readabilityjs/test/test-pages/caixin/source.html
new file mode 100644
index 000000000..82ba1f9fa
--- /dev/null
+++ b/packages/readabilityjs/test/test-pages/caixin/source.html
@@ -0,0 +1,2268 @@
+ + + + + + + + + +
+ 途虎养车港交所挂牌 腾讯为最大外部股东_财新网_财新网
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+ + +
+
+
+ + +
财新传媒 + + +
+
+
+
+ 财新网 > 汽车 > 正文 +
+ +
+ +
+
+ +
+
+ + +
+ +
+ +
+
+ +
+ + + + +
+
+

+ 途虎养车港交所挂牌 腾讯为最大外部股东 +

+ +
+ +
+ 文|财新 余聪 +
+ 2023年09月26日 17:22 + + + 试听 +
+
+ 国内汽车服务市场高度分散,2022年,途虎养车取得汽车服务收入115亿元,市场份额0.9% +
+
+
+
+
+ +
+
+ 上海,一处途虎养车门店。途虎养车2022年1月即在港交所递表,2022年8月、2023年3月两次重新递交上市申请材料,终于在2023年8月23日通过聆讯。图:Qilai Shen/视觉中国 +
+
+
+
+ + +
+ + +
+

+   【财新网】9月26日,汽车服务平台途虎养车正式在港交所主板挂牌上市。途虎养车( 09690.HK )上市发行价为28港元/股,此前公司披露的发行价区间为28港元/股至31港元/股,即实际发行价为区间下限。当日,途虎养车收报29.5港元/股,较发行价涨5.36%,市值为239.6亿港元。 +

+

+   途虎养车上市不易。途虎养车2022年1月即在港交所递表,2022年8月、2023年3月两次重新递交上市申请材料,终于在2023年8月23日通过聆讯。 +

+
+
+
+
+ + +
+
+ +
+
+ + +
+ 登录 后获取已订阅的阅读权限 +
+
+
+ 财新通会员
+ 可畅读全文 +
订阅/会员升级 +
+
+
+
+
+
+ 请朋友免费读财新 +
+
+
+
+ +
+
+
+ + + + +
+
+
+ + +
+ +
+
+ +
+
+ +
+
+
+
+
+
+ +
+ +
+

+   推荐进入财新数据库,可随时查阅公司股价走势、结构人员变化等投资信息。 +

+
+
+
+ 责任编辑:屈运栩 | 版面编辑:刘潇(ZN028) +
+ + +
+ +
+ +
+
+ 话题: +
+
+ #港交所+关注 +
+
+ #腾讯+关注 +
+
+ #京东+关注 +
+
+
+ +
+ +
+ +
+ +
+
+ + + +
+ +
+ + + + +
+ + +
+ +
+ +
+

+ 图片推荐 +

+
+ + +
+ +
+
+ +
+
+ + + + +
+ +
+
+ +
+ +
+ +
+ +
+
+ +
+
+ + +
+
+
+ 财新网主编精选版电邮 + 样例 +
+
+ 财新网新闻版电邮全新升级!财新网主编精心编写,每个工作日定时投递,篇篇重磅,可信可引。 +
+
+ 订阅 +
+
+
+ + + +
+ + + + + + +
+

+ 视频 +

+
+ +
+
 + + + + +
+
+
+ +
+
+ + + + + + +
+ + + + + + + + + + + + + + + + +
+
+
+

+ +

+
+ +
+
+ +
+ +
diff --git a/packages/readabilityjs/test/test-pages/caixin/url.txt b/packages/readabilityjs/test/test-pages/caixin/url.txt
new file mode 100644
index 000000000..dbae9da78
--- /dev/null
+++ b/packages/readabilityjs/test/test-pages/caixin/url.txt
@@ -0,0 +1 @@
+https://www.caixin.com/2023-09-26/102112537.html
\ No newline at end of file
diff --git a/packages/readabilityjs/test/test-pages/economist/expected-metadata.json b/packages/readabilityjs/test/test-pages/economist/expected-metadata.json
new file mode 100644
index 000000000..c2b5d1cdf
--- /dev/null
+++ b/packages/readabilityjs/test/test-pages/economist/expected-metadata.json
@@ -0,0 +1,12 @@
+{
+  "title": "Could the 14th Amendment bar Donald Trump from becoming president again?",
+  "byline": null,
+  "dir": null,
+  "excerpt": "Some conservative legal scholars think so—but the idea is a long shot",
+  "siteName": "The Economist",
+  "siteIcon": "http://fakehost/favicon.ico",
+  "previewImage": "https://www.economist.com/img/b/1280/720/90/media-assets/image/20230923_BLP505.jpg",
+  "publishedDate": "2023-09-19T16:00:00.000Z",
+  "language": "English",
+  "readerable": true
+}
diff --git a/packages/readabilityjs/test/test-pages/economist/expected.html b/packages/readabilityjs/test/test-pages/economist/expected.html
new file mode 100644
index 000000000..9b66b5b0d
--- /dev/null
+++ b/packages/readabilityjs/test/test-pages/economist/expected.html
@@ -0,0 +1,39 @@
+
+
+
+
+
+
+

Some conservative legal scholars think so—but the idea is a long shot 

+
+
+
+ Donald Trump speaks on the stage at South Dakota Republican party rally in Rapid City +
+ image: Reuters +
+
+
+
+
+

+ DONALD TRUMP’S campaign for re-election is dogged with legal woes. The former president faces the prospect of four criminal trials on felony charges, which will overlap with the Republican primary season and the general-election campaign. But another type of legal trouble could further complicate his return to the White House. +

+

America’s constitution—which Mr Trump swore to uphold on January 20th 2017—includes a provision barring people who have taken such an oath from holding federal office if they have “engaged in insurrection or rebellion” against the country or “given aid or comfort to the enemies thereof”. This language, found in Section 3 of the 14th Amendment, was ratified after the civil war to prevent former Confederate rebels from having a hand in running the country they had tried to saw in half. The disqualification clause has seen something of a renaissance. A year ago, Couy Griffin, then a county commissioner in New Mexico, was removed from office by a state judge for engaging in insurrection  at the Capitol on January 6th. But could this constitutional provision really thwart Mr Trump’s quest for a second presidential term?

+
+
+

The Economist today +

+

Handpicked stories, in your inbox

+

A daily newsletter with the best of our journalism

+
+

+

+
+
+
+
+
+
+
+
\ No newline at end of file diff --git a/packages/readabilityjs/test/test-pages/economist/source.html b/packages/readabilityjs/test/test-pages/economist/source.html new file mode 100644 index 000000000..33239ec9a --- /dev/null +++ b/packages/readabilityjs/test/test-pages/economist/source.html @@ -0,0 +1,1385 @@ + + + + + + + Could the 14th Amendment bar Donald Trump from becoming president again? + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+
+
+
+
+
+
+
+
+
+
+ + + The Economist + + + Skip to content +
+ +
+
+ Subscribe +
+ +
+
+ + +
+ +
+
+ +
+ +
+
+
+
+
+
+
+
+
+
+ +
+
+
+
+
+
+ +

+ Could the 14th Amendment bar Donald Trump from becoming president again? +

+

+ Some conservative legal scholars think so—but the idea is a long shot  +

+
+
+
+
+
+ Donald Trump speaks on the stage at South Dakota Republican party rally in Rapid City +
+ image: Reuters +
+
+
+
+
+
+
+
+
+
+ +
+
+
+
+
+ +
+
+
+
+
+
+
+
+

+ DONALD TRUMP’S campaign for re-election is dogged with legal woes. The former president faces the prospect of four criminal trials on felony charges, which will overlap with the Republican primary season and the general-election campaign. But another type of legal trouble could further complicate his return to the White House. +

+

+ America’s constitution—which Mr Trump swore to uphold on January 20th 2017—includes a provision barring people who have taken such an oath from holding federal office if they have “engaged in insurrection or rebellion” against the country or “given aid or comfort to the enemies thereof”. This language, found in Section 3 of the 14th Amendment, was ratified after the civil war to prevent former Confederate rebels from having a hand in running the country they had tried to saw in half. The disqualification clause has seen something of a renaissance. A year ago, Couy Griffin, then a county commissioner in New Mexico, was removed from office by a state judge for engaging in insurrection  at the Capitol on January 6th. But could this constitutional provision really thwart Mr Trump’s quest for a second presidential term? +

+
+
+
+ +
+
+
+ +
+
+
+
+
+
+
+ +
Reuse this content +
+
+
+
+
+
+
+
+ + + + + + + + + + + + + + + + + + + + +
+
+
+ The Economist today +
+

+ Handpicked stories, in your inbox +

+
+

+ A daily newsletter with the best of our journalism +

+
+
+
+
+
+ +
+
+
+ +
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+

+ More from The Economist explains +

+
+
+
+
+
+
+ +
+
+

+ What is America’s farm bill, and why does it matter? +

+

+ It has transformed the agriculture industry and given millions of Americans food security +

+
+
+
+
+
+
+ +
+
+

+ Why Poland is halting its supply of weapons to Ukraine +

+

+ A row over duty-free grain has escalated rapidly—but Poland’s government is also posturing +

+
+
+
+
+
+ +
+
+

+ What is Khalistan, the independent homeland some Sikhs yearn for? +

+

+ The separatist movement is now largely propagated from abroad +

+
+
+
+
+
+
+
+
+
+
+
+ +
+
+
+
+ + + + + + + + + + + + + +
+ +
+ +
diff --git a/packages/readabilityjs/test/test-pages/economist/url.txt b/packages/readabilityjs/test/test-pages/economist/url.txt
new file mode 100644
index 000000000..0e967ebe7
--- /dev/null
+++ b/packages/readabilityjs/test/test-pages/economist/url.txt
@@ -0,0 +1 @@
+https://www.economist.com/the-economist-explains/2023/09/20/could-the-14th-amendment-bar-donald-trump-from-becoming-president-again
\ No newline at end of file