Commit Graph

182 Commits

SHA1 Message Date
6b57ab7a84 Update readability test for new _cleanElement name 2024-08-29 10:59:53 +08:00
43012d41f5 Update generated html 2024-08-13 08:58:53 +00:00
cc3cd87013 fix: navigation timeout when generating test case for raw story webpage 2024-08-13 16:57:48 +08:00
7d2e10ba5f Update generated html 2024-07-17 07:57:11 +00:00
34d8fc54e2 fix: substack headings are removed because its class name contains header 2024-07-17 15:56:09 +08:00
e0ad660f8f use first paragraph more than 50 characters as excerpt 2024-06-05 08:57:10 +08:00
f7c6a02c34 Update generated html 2024-06-01 03:21:08 +00:00
d74f0b7a98 add test 2024-06-01 11:20:12 +08:00
8a2ea4b0b1 Update generated html 2024-03-04 05:24:59 +00:00
0e46dc2302 save dir in the database 2024-03-04 12:28:51 +08:00
fd7c2ffb49 fix code blocks not formatted correctly in articles from wechat official accounts 2024-01-25 16:30:59 +08:00
04fd8d2e5d Update generated html 2024-01-25 16:30:59 +08:00
7ff327787b add test case 2024-01-25 16:30:59 +08:00
9922e0b2d8 Update generated html 2023-12-20 08:19:27 +00:00
1a371a80b1 remove hidden labels from substack post by readability 2023-12-20 16:18:25 +08:00
313fd77bef Update generated html 2023-10-20 18:59:22 +08:00
1b1cce7485 disable javascript for the host 2023-10-20 18:59:22 +08:00
3bd43048a4 add test case for forte labs newsletter 2023-10-20 18:59:22 +08:00
2b23b0e002 Update generated html 2023-09-28 09:37:48 +00:00
001403c02d fix tests 2023-09-28 17:36:27 +08:00
310ad5de1d get published date from url 2023-09-28 17:35:33 +08:00
45b7c2b619 get published date from time element 2023-09-28 17:14:17 +08:00
f0abdd654a Update generated html 2023-09-28 02:35:06 +00:00
55e274a32c better match of published date and avoid removing date string which is not a published date 2023-09-28 10:34:05 +08:00
0ccf332ab0 Update generated html 2023-09-27 07:33:55 +00:00
3399213328 add test cases from economist and caixin 2023-09-27 15:32:42 +08:00
53a6a5e6b9 Update generated html 2023-09-12 08:50:11 +00:00
e38411af33 Boost content length of emoji 2023-09-12 16:49:05 +08:00
f157063187 Update generated html 2023-08-16 03:58:00 +00:00
ad6ce8077b specially handle zhihu.com 2023-08-16 11:54:48 +08:00
51a2029f65 fix title not fetched correctly for some chinese websites 2023-08-16 10:45:48 +08:00
94b7399b1c Add points for any commas within this paragraph 2023-08-15 21:21:17 +08:00
d955e53fdd generate test page for zhihu 2023-08-15 11:15:19 +08:00
7641a2567e disable extensions too 2023-08-02 16:12:24 +08:00
4eab6ea6d2 remove hardware acceleration 2023-08-02 16:07:43 +08:00
a97fcd1e88 do not use single process in chromium 2023-08-02 15:58:32 +08:00
63cbb3011e upgrade puppeteer and update chromium args 2023-08-02 15:33:15 +08:00
837bea4913 Update generated html 2023-07-24 05:15:02 +00:00
2f0c830843 Improve readability for lesswrong.com 2023-07-24 13:13:50 +08:00
48ed5ec745 Hide webflow test elements 2023-07-24 12:35:18 +08:00
545e396d6e Update generated html 2023-07-24 12:21:51 +08:00
9d49b683f5 Add test for webflow page that includes an embedded textbox 2023-07-24 12:17:45 +08:00
e446634504 Update generated html 2023-06-26 08:41:23 +00:00
244fb4ccb5 fix: removing node with background image 2023-06-26 16:40:14 +08:00
90d41c5e85 Update generated html 2023-06-26 04:03:48 +00:00
a964c59d80 fix: missing links (skip removing <a> elements with published date in the url) 2023-06-26 11:50:50 +08:00
8d3db4161b Update generated html 2023-06-22 09:41:25 +00:00
9d1eb3bfe6 add testcase 2023-06-22 16:32:30 +08:00
2aa2d09cef Update generated html 2023-05-26 09:50:22 +00:00
141266461c add test case for lesswrong 2023-05-26 17:49:12 +08:00