Commit Graph

254 Commits

SHA1 Message Date
e031f4f81c deprioritize jsonld preview image 2024-03-19 15:51:20 +08:00
8a2ea4b0b1 Update generated html 2024-03-04 05:24:59 +00:00
0e46dc2302 save dir in the database 2024-03-04 12:28:51 +08:00
4cfe4f95ae Remove bottom-wrapper element which is added on NYT and some other sites' RSS feeds 2024-02-20 11:26:21 +08:00
f33ff113db Improve readability for the verge 2024-02-15 10:57:36 +08:00
d5d581fc54 remove redundant logs 2024-01-29 14:32:43 +08:00
fd7c2ffb49 fix code blocks not formatted correctly in articles from wechat official accounts 2024-01-25 16:30:59 +08:00
04fd8d2e5d Update generated html 2024-01-25 16:30:59 +08:00
7ff327787b add test case 2024-01-25 16:30:59 +08:00
dae90aaffa fix code blocks not formatted correctly 2024-01-25 16:30:59 +08:00
b3f052860a Merge pull request #3175 from omnivore-app/fix/detect-language: detect language from html content 2024-01-03 17:50:01 +08:00
9922e0b2d8 Update generated html 2023-12-20 08:19:27 +00:00
1a371a80b1 remove hidden labels from substack post by readability 2023-12-20 16:18:25 +08:00
77110640a1 detect language from content 2023-11-27 23:09:20 +08:00
10a21adc33 detect language from content if locale not found in metadata 2023-11-27 22:54:39 +08:00
c4773dc904 Landing page improvements and various supporting improvements 2023-10-24 09:43:39 +01:00
313fd77bef Update generated html 2023-10-20 18:59:22 +08:00
1b1cce7485 disable javascript for the host 2023-10-20 18:59:22 +08:00
3bd43048a4 add test case for forte labs newsletter 2023-10-20 18:59:22 +08:00
2b23b0e002 Update generated html 2023-09-28 09:37:48 +00:00
001403c02d fix tests 2023-09-28 17:36:27 +08:00
310ad5de1d get published date from url 2023-09-28 17:35:33 +08:00
45b7c2b619 get published date from time element 2023-09-28 17:14:17 +08:00
f0abdd654a Update generated html 2023-09-28 02:35:06 +00:00
55e274a32c better match of published date and avoid removing date string which is not a published date 2023-09-28 10:34:05 +08:00
0ccf332ab0 Update generated html 2023-09-27 07:33:55 +00:00
3399213328 add test cases from economist and caixin 2023-09-27 15:32:42 +08:00
60b7d500a2 fix long published date not parsed correctly 2023-09-26 21:41:27 +08:00
d37cb7fda1 fix published date in chinese not parsed correctly 2023-09-26 20:48:01 +08:00
53a6a5e6b9 Update generated html 2023-09-12 08:50:11 +00:00
e38411af33 Boost content length of emoji 2023-09-12 16:49:05 +08:00
08dbe2dead Handle the ignore density check in the getLinkDensity function 2023-09-12 16:04:58 +08:00
293becf596 Ignore link density checks in newsletters 2023-09-12 15:53:43 +08:00
f157063187 Update generated html 2023-08-16 03:58:00 +00:00
ad6ce8077b specially handle zhihu.com 2023-08-16 11:54:48 +08:00
51a2029f65 fix title not fetched correctly for some chinese websites 2023-08-16 10:45:48 +08:00
3119471d1c do not remove QuestionHeader 2023-08-15 21:31:19 +08:00
94b7399b1c Add points for any commas within this paragraph 2023-08-15 21:21:17 +08:00
d955e53fdd generate test page for zhihu 2023-08-15 11:15:19 +08:00
7641a2567e disable extensions too 2023-08-02 16:12:24 +08:00
4eab6ea6d2 remove hardware acceleration 2023-08-02 16:07:43 +08:00
a97fcd1e88 do not use single process in chromium 2023-08-02 15:58:32 +08:00
63cbb3011e upgrade puppeteer and update chromium args 2023-08-02 15:33:15 +08:00
837bea4913 Update generated html 2023-07-24 05:15:02 +00:00
2f0c830843 Improve readability for lesswrong.com 2023-07-24 13:13:50 +08:00
48ed5ec745 Hide webflow test elements 2023-07-24 12:35:18 +08:00
545e396d6e Update generated html 2023-07-24 12:21:51 +08:00
9d49b683f5 Add test for webflow page that includes an embedded textbox 2023-07-24 12:17:45 +08:00
e446634504 Update generated html 2023-06-26 08:41:23 +00:00
244fb4ccb5 fix: removing node with background image 2023-06-26 16:40:14 +08:00