| 87b4ec503e | enqueue content-fetch task to the queue | 2024-08-21 12:24:35 +08:00 |
| e3eae1c96c | create a worker to process content-fetch job | 2024-08-21 12:24:35 +08:00 |
| 4674321531 | reduce blocking domain to 1 hour | 2024-08-18 12:37:10 +08:00 |
| 322f736fe0 | stop storing original html in the database | 2024-07-31 19:14:38 +08:00 |
| 0e0c4bddac | block failed domains | 2024-07-24 16:55:50 +08:00 |
| 31fe4b65a0 | remove readability from content-fetch | 2024-07-24 12:53:41 +08:00 |
| 29a5b20d2c | remove scrapingbee from content-fetch | 2024-07-24 12:17:13 +08:00 |
| 75338f5927 | bypass cloudflare captcha | 2024-07-10 14:43:47 +08:00 |
| 73e180f43d | add more dependencies to docker container | 2024-07-09 19:16:21 +08:00 |
| c75cbb39d6 | injecting webgl fingerprint | 2024-07-09 14:11:31 +08:00 |
| dd01202374 | do not cache some urls | 2024-07-05 19:46:18 +08:00 |
| 728059c6f8 | do not cache some urls | 2024-07-05 19:05:36 +08:00 |
| b38b28c75e | create a browser singleton instance and checks browser existence before creating context | 2024-07-04 19:12:42 +08:00 |
| bbc7b5e600 | use @omnivore/utils in import-handler | 2024-07-03 22:20:27 +08:00 |
| 59c826fd5e | use @omnivore/utils in content-fetch | 2024-07-03 21:58:22 +08:00 |
| f2ff4b7b0a | fix: only send content_fetch_failure event to analytics | 2024-05-31 12:44:01 +08:00 |
| fc9d5c64ec | do not fail if cache missed | 2024-05-17 17:27:34 +08:00 |
| 6f2aa2e0cd | add more logs | 2024-05-17 17:19:55 +08:00 |
| 52ebf466e3 | get content from cache first when saving url | 2024-05-17 16:46:54 +08:00 |
| 9c3d619ad5 | put locale and timezone in cache key | 2024-05-17 16:22:20 +08:00 |
| dde9f16396 | put error message in the analytic event | 2024-05-17 16:16:44 +08:00 |
| f3ce6f4d4e | catch content fetch result in redis | 2024-05-17 15:55:28 +08:00 |
| efb9b6b139 | add source to the content_fetch event | 2024-05-17 14:54:46 +08:00 |
| 9dee510be1 | fix rss | 2024-05-14 20:18:18 +08:00 |
| cce5f2463d | still use redis for cache | 2024-05-14 17:16:26 +08:00 |
| 04ba62977e | fix rebase conflicts | 2024-05-14 17:14:41 +08:00 |
| e093c9e096 | fix comment | 2024-05-14 17:14:41 +08:00 |
| 3e925e0193 | update comment | 2024-05-14 17:14:41 +08:00 |
| 5bd157ca25 | hash url as the key | 2024-05-14 17:14:41 +08:00 |
| 7a0b2f3d33 | upload file only not exists | 2024-05-14 17:14:41 +08:00 |
| 9286174ec7 | upload and download original content from GCS | 2024-05-14 17:14:40 +08:00 |
| 33e1c4dd00 | remove flush method from analytics class | 2024-05-13 19:10:14 +08:00 |
| 7634ed667f | capture total time of fetching a page | 2024-05-13 17:01:52 +08:00 |
| f64bd4700f | update analytic event details | 2024-05-13 15:18:04 +08:00 |
| a924c8448b | capture content-fetch success and error events | 2024-05-13 14:55:48 +08:00 |
| 0c0a95a79c | fix newsletter dir not saved correctly | 2024-04-24 21:10:13 +08:00 |
| 824b256d20 | fix memory leak from axios error | 2024-04-24 15:55:54 +08:00 |
| 7f441b4ff3 | dedupe save-page job | 2024-04-23 21:44:25 +08:00 |
| 88a7e8d85b | fix tests | 2024-04-04 12:17:15 +08:00 |
| 927394e07c | fix: save url operation is delayed | 2024-02-21 17:54:08 +08:00 |
| 819135f118 | Label the content-fetch and web images | 2024-02-21 14:16:21 +08:00 |
| 856a7bc5ef | Add health checks to content-fetch | 2024-02-12 14:42:51 +08:00 |
| 86f4553dd1 | fix: duplicate key value violates unique constraint "library_item_pkey" | 2024-02-07 20:41:34 +08:00 |
| 9a342f273c | Increase delay on save page retry | 2024-02-06 12:41:11 +08:00 |
| 9e6b5d2dcf | Backoff for save page | 2024-02-03 15:35:08 +08:00 |
| fd80724d79 | fallback to guid if link of rss item is not available | 2024-01-26 22:09:53 +08:00 |
| 52419c39da | resolve conflicts | 2024-01-26 13:01:53 +08:00 |
| e4e07c8acf | cache fetched content for 24 hours | 2024-01-26 13:01:53 +08:00 |
| 7e89235806 | update content-fetch | 2024-01-26 13:01:53 +08:00 |
| 5e239d2568 | run readability in save-page instead of puppeteer | 2024-01-25 16:30:59 +08:00 |