176 Commits

SHA1 Message Date
f7225b298a Rebase 2024-03-14 12:48:56 +08:00
29cb15de32 use nitter handler for twitter new domain x.com 2024-01-26 21:50:21 +08:00
e96458192d fix redis client cert in nitter handler 2024-01-23 19:30:38 +08:00
d9feb740cb convert content-fetch to typescript 2024-01-18 18:48:46 +08:00
cd3402b98a rewrite puppeteer in typescript 2024-01-18 18:48:46 +08:00
150a456c35 replace redis client library with ioredis 2024-01-16 15:42:50 +08:00
c4773dc904 Landing page improvements and various supporting improvements 2023-10-24 09:43:39 +01:00
38a1290b2a Use scrapingbee for fool.ca 2023-08-30 12:13:52 +08:00
ad6ce8077b specially handle zhihu.com 2023-08-16 11:54:48 +08:00
053625276b Merge pull request #2608 from Podginator/feat/Ars-technica-handler 2023-08-07 10:43:39 +08:00
    Ars Technica Multipage handling
0392f6f009 Linting Fixes 2023-08-04 22:36:15 +02:00
49c1976ac9 Add Ars Technica handler for multipage articles 2023-08-04 20:00:03 +02:00
cf5f46026a Add Paywalled Wired Handler 2023-08-02 19:53:20 +02:00
3146b3bcc5 remove redundant logs 2023-07-27 17:55:17 +08:00
a3fa695957 add comment 2023-07-04 20:47:51 +08:00
a818fa721c use asia/shanghai timezone to format published date in wechat articles 2023-07-04 20:46:39 +08:00
43cb72db97 comment 2023-07-04 19:02:47 +08:00
e06f833e21 fix: missing published date in some wechat articles 2023-07-04 18:46:35 +08:00
    * some wechat articles do not have `create_time` in the dom javascript so use the `publish_time` node in the dom content
607e22ce94 fix: tweet with no text failed to save 2023-06-26 21:46:52 +08:00
937be20928 fix: tweet format 2023-06-26 21:11:04 +08:00
15c964481a fix: tweet not saved 2023-06-26 18:07:47 +08:00
    * compare username ignoring case
978a21d160 fix typo 2023-06-23 16:59:34 +08:00
a5ddc9aced lowercase username before comparing 2023-06-23 16:56:03 +08:00
41f0c5b3a3 skip saving unavailable and earlier replies 2023-06-23 16:26:07 +08:00
4b896dd56a update instances 2023-06-23 14:30:13 +08:00
918585dfb6 get the redis connection info from env 2023-06-23 14:08:20 +08:00
e2b66be75d fix dependencies 2023-06-23 13:33:38 +08:00
716fdf9a61 expire key after 1 day 2023-06-23 13:19:18 +08:00
fc0257b480 sort by descending order 2023-06-23 12:52:27 +08:00
b746046727 update default score 2023-06-23 12:34:31 +08:00
9ef40b3ed9 deduct latency from the score of the member in sorted set 2023-06-23 12:27:34 +08:00
ae504a970b save instances in redis as a sorted set 2023-06-23 12:17:41 +08:00
4607754f58 reduce timeout to 20s 2023-06-23 11:36:30 +08:00
6b203dde4c update instances pool to pick one from each country 2023-06-23 10:54:43 +08:00
61542a34bd fix: create an instance pool to scrape nitter 2023-06-23 10:45:37 +08:00
30b7c38e76 fix: increase twitter-handler timeout to 60 seconds 2023-06-22 21:16:55 +08:00
12009d1d06 fix: tweet author not saved 2023-06-22 18:05:53 +08:00
0a0d64ca4e Merge pull request #2409 from omnivore-app/fix/nitter-scraper 2023-06-22 15:01:28 +08:00
    using axios to fetch html from nitter.net
3e2e16e915 skip running readability on scraped tweet 2023-06-22 15:00:54 +08:00
0d7adcdecc using axios to fetch html from nitter.net 2023-06-22 14:32:18 +08:00
a0556055c5 Merge pull request #2389 from Podginator/feat/atlantic-handler 2023-06-22 10:21:13 +08:00
    Add The Atlantic handler to avoid paywall and correctly format
de2ca58b37 Linting fix 2023-06-22 09:52:43 +08:00
6aceadc819 fix attachment url 2023-06-21 17:46:05 +08:00
f0e6458e4a filter out replies 2023-06-21 16:01:09 +08:00
a3ad9375bf replace url in tweets 2023-06-21 15:50:55 +08:00
d2094981ad increase timeout value to 60 seconds 2023-06-21 15:21:40 +08:00
dde104b95e fix video attachment not saved 2023-06-21 15:07:21 +08:00
a109d0383f go to the next thread page to scrape long thread 2023-06-20 19:17:34 +08:00
89f89875d6 fix date parsing 2023-06-20 17:35:59 +08:00
7293078068 Remove Audio section for Feature Articles 2023-06-20 10:35:04 +02:00