| c4773dc904 | Landing page improvements and various supporting improvements | 2023-10-24 09:43:39 +01:00 |
| 38a1290b2a | Use scrapingbee for fool.ca | 2023-08-30 12:13:52 +08:00 |
| ad6ce8077b | specially handle zhihu.com | 2023-08-16 11:54:48 +08:00 |
| 053625276b | Merge pull request #2608 from Podginator/feat/Ars-technica-handler: Ars Technica Multipage handling | 2023-08-07 10:43:39 +08:00 |
| 0392f6f009 | Linting Fixes | 2023-08-04 22:36:15 +02:00 |
| 49c1976ac9 | Add Ars Technica handler for multipage articles | 2023-08-04 20:00:03 +02:00 |
| cf5f46026a | Add Paywalled Wired Handler | 2023-08-02 19:53:20 +02:00 |
| 3146b3bcc5 | remove redundant logs | 2023-07-27 17:55:17 +08:00 |
| a3fa695957 | add comment | 2023-07-04 20:47:51 +08:00 |
| a818fa721c | use asia/shanghai timezone to format published date in wechat articles | 2023-07-04 20:46:39 +08:00 |
| 43cb72db97 | comment | 2023-07-04 19:02:47 +08:00 |
| e06f833e21 | fix: missing published date in some wechat articles (some wechat articles do not have `create_time` in the dom javascript so use the `publish_time` node in the dom content) | 2023-07-04 18:46:35 +08:00 |
| 607e22ce94 | fix: tweet with no text failed to save | 2023-06-26 21:46:52 +08:00 |
| 937be20928 | fix: tweet format | 2023-06-26 21:11:04 +08:00 |
| 15c964481a | fix: tweet not saved (compare username ignoring case) | 2023-06-26 18:07:47 +08:00 |
| 978a21d160 | fix typo | 2023-06-23 16:59:34 +08:00 |
| a5ddc9aced | lowercase username before comparing | 2023-06-23 16:56:03 +08:00 |
| 41f0c5b3a3 | skip saving unavailable and earlier replies | 2023-06-23 16:26:07 +08:00 |
| 4b896dd56a | update instances | 2023-06-23 14:30:13 +08:00 |
| 918585dfb6 | get the redis connection info from env | 2023-06-23 14:08:20 +08:00 |
| e2b66be75d | fix dependencies | 2023-06-23 13:33:38 +08:00 |
| 716fdf9a61 | expire key after 1 day | 2023-06-23 13:19:18 +08:00 |
| fc0257b480 | sort by descending order | 2023-06-23 12:52:27 +08:00 |
| b746046727 | update default score | 2023-06-23 12:34:31 +08:00 |
| 9ef40b3ed9 | deduct latency from the score of the member in sorted set | 2023-06-23 12:27:34 +08:00 |
| ae504a970b | save instances in redis as a sorted set | 2023-06-23 12:17:41 +08:00 |
| 4607754f58 | reduce timeout to 20s | 2023-06-23 11:36:30 +08:00 |
| 6b203dde4c | update instances pool to pick one from each country | 2023-06-23 10:54:43 +08:00 |
| 61542a34bd | fix: create an instance pool to scrape nitter | 2023-06-23 10:45:37 +08:00 |
| 30b7c38e76 | fix: increase twitter-handler timeout to 60 seconds | 2023-06-22 21:16:55 +08:00 |
| 12009d1d06 | fix: tweet author not saved | 2023-06-22 18:05:53 +08:00 |
| 0a0d64ca4e | Merge pull request #2409 from omnivore-app/fix/nitter-scraper: using axios to fetch html from nitter.net | 2023-06-22 15:01:28 +08:00 |
| 3e2e16e915 | skip running readability on scraped tweet | 2023-06-22 15:00:54 +08:00 |
| 0d7adcdecc | using axios to fetch html from nitter.net | 2023-06-22 14:32:18 +08:00 |
| a0556055c5 | Merge pull request #2389 from Podginator/feat/atlantic-handler: Add The Atlantic handler to avoid paywall and correctly format | 2023-06-22 10:21:13 +08:00 |
| de2ca58b37 | Linting fix | 2023-06-22 09:52:43 +08:00 |
| 6aceadc819 | fix attachment url | 2023-06-21 17:46:05 +08:00 |
| f0e6458e4a | filter out replies | 2023-06-21 16:01:09 +08:00 |
| a3ad9375bf | replace url in tweets | 2023-06-21 15:50:55 +08:00 |
| d2094981ad | increase timeout value to 60 seconds | 2023-06-21 15:21:40 +08:00 |
| dde104b95e | fix video attachment not saved | 2023-06-21 15:07:21 +08:00 |
| a109d0383f | go to the next thread page to scrape long thread | 2023-06-20 19:17:34 +08:00 |
| 89f89875d6 | fix date parsing | 2023-06-20 17:35:59 +08:00 |
| 7293078068 | Remove Audio section for Feature Articles | 2023-06-20 10:35:04 +02:00 |
| 44dd44dd58 | update comments on tests | 2023-06-20 10:26:53 +02:00 |
| c0f34270ee | Remove Related content links from article (IE: Read article X) | 2023-06-20 10:24:41 +02:00 |
| 55fd02b67a | Fix Tests | 2023-06-20 09:20:43 +02:00 |
| ff9fcef08d | disable twitter-handler | 2023-06-20 14:59:14 +08:00 |
| 65758f3469 | feat: add nitter handler to scrape tweets from twitter.com and nitter.net | 2023-06-20 14:56:54 +08:00 |
| 8e13284795 | Linting fixes | 2023-06-20 13:48:18 +08:00 |