|
|
f87ecbcf18
|
Merge pull request #4207 from omnivore-app/fix/published-date-change-in-wechat
fix/published date change in wechat
|
2024-07-23 13:58:17 +08:00 |
|
|
|
d9b088cc01
|
fix tests
|
2024-07-18 12:22:21 +08:00 |
|
|
|
f2b3a66b72
|
remove metadata and cover image from content
|
2024-07-18 11:18:51 +08:00 |
|
|
|
34d8fc54e2
|
fix: substack headings are removed because their class names contain header
|
2024-07-17 15:56:09 +08:00 |
|
|
|
e0ad660f8f
|
use first paragraph longer than 50 characters as excerpt
|
2024-06-05 08:57:10 +08:00 |
|
|
|
9316f8a6e1
|
fix: use article's first long paragraph as excerpt if description not found
|
2024-06-04 20:11:11 +08:00 |
|
|
|
d74f0b7a98
|
add test
|
2024-06-01 11:20:12 +08:00 |
|
|
|
02ba76f87d
|
fix: thumbnail fails to display for some webpages
* The regex we use is not strict enough, so it matches og:image:type instead of og:image when an og:image:type tag is present
|
2024-06-01 10:08:07 +08:00 |
|
|
|
4d0f1bec88
|
Add support for embedding TikTok videos
|
2024-05-13 13:30:54 +08:00 |
|
|
|
e031f4f81c
|
deprioritize jsonld preview image
|
2024-03-19 15:51:20 +08:00 |
|
|
|
4cfe4f95ae
|
Remove bottom-wrapper element which is added on NYT and some other sites RSS feeds
|
2024-02-20 11:26:21 +08:00 |
|
|
|
f33ff113db
|
Improve readability for the verge
|
2024-02-15 10:57:36 +08:00 |
|
|
|
d5d581fc54
|
remove redundant logs
|
2024-01-29 14:32:43 +08:00 |
|
|
|
fd7c2ffb49
|
fix code blocks not formatted correctly in articles from wechat official accounts
|
2024-01-25 16:30:59 +08:00 |
|
|
|
dae90aaffa
|
fix code blocks not formatted correctly
|
2024-01-25 16:30:59 +08:00 |
|
|
|
b3f052860a
|
Merge pull request #3175 from omnivore-app/fix/detect-language
detect language from html content
|
2024-01-03 17:50:01 +08:00 |
|
|
|
1a371a80b1
|
remove hidden labels from substack post by readability
|
2023-12-20 16:18:25 +08:00 |
|
|
|
77110640a1
|
detect language from content
|
2023-11-27 23:09:20 +08:00 |
|
|
|
10a21adc33
|
detect language from content if locale not found in metadata
|
2023-11-27 22:54:39 +08:00 |
|
|
|
310ad5de1d
|
get published date from url
|
2023-09-28 17:35:33 +08:00 |
|
|
|
45b7c2b619
|
get published date from time element
|
2023-09-28 17:14:17 +08:00 |
|
|
|
55e274a32c
|
match published date more accurately and avoid removing date strings that are not published dates
|
2023-09-28 10:34:05 +08:00 |
|
|
|
3399213328
|
add test cases from economist and caixin
|
2023-09-27 15:32:42 +08:00 |
|
|
|
60b7d500a2
|
fix long published date not parsed correctly
|
2023-09-26 21:41:27 +08:00 |
|
|
|
d37cb7fda1
|
fix published date in Chinese not parsed correctly
|
2023-09-26 20:48:01 +08:00 |
|
|
|
e38411af33
|
Boost content length of emoji
|
2023-09-12 16:49:05 +08:00 |
|
|
|
08dbe2dead
|
Handle the ignore density check in the getLinkDensity function
|
2023-09-12 16:04:58 +08:00 |
|
|
|
293becf596
|
Ignore link density checks in newsletters
|
2023-09-12 15:53:43 +08:00 |
|
|
|
51a2029f65
|
fix title not fetched correctly for some Chinese websites
|
2023-08-16 10:45:48 +08:00 |
|
|
|
3119471d1c
|
do not remove QuestionHeader
|
2023-08-15 21:31:19 +08:00 |
|
|
|
94b7399b1c
|
Add points for any commas within this paragraph
|
2023-08-15 21:21:17 +08:00 |
|
|
|
2f0c830843
|
Improve readability for lesswrong.com
|
2023-07-24 13:13:50 +08:00 |
|
|
|
48ed5ec745
|
Hide webflow test elements
|
2023-07-24 12:35:18 +08:00 |
|
|
|
a964c59d80
|
fix: missing links
* skip removing <a> elements with published date in the url
|
2023-06-26 11:50:50 +08:00 |
|
|
|
2fbee1e831
|
Remove webflow invisible elements
|
2023-04-28 19:57:13 +08:00 |
|
|
|
fbb638619c
|
Mark the related stories and social buttons as unlikely candidates
|
2023-04-19 17:04:01 +08:00 |
|
|
|
add54b1e35
|
For lazy loaded images use their lazy src as the src URL
|
2023-04-11 10:58:06 +08:00 |
|
|
|
eb58bf11ba
|
Force to use content handler of piped.video when saving from extensions
|
2023-04-10 20:52:09 +08:00 |
|
|
|
deff73953a
|
Do not delete embedded iframe of piped video
|
2023-04-06 16:30:52 +08:00 |
|
|
|
2378abef4a
|
Merge pull request #1962 from omnivore-app/fix/newline-in-author
Remove \n and extra spaces from author, and trim it
|
2023-03-31 10:21:42 +08:00 |
|
|
|
f77aae9810
|
Remove \n and extra spaces from author, and trim it
|
2023-03-30 21:55:41 +08:00 |
|
|
|
db687f151b
|
Strip the tl_article_header element
|
2023-03-30 19:31:43 +08:00 |
|
|
|
895e50201a
|
Fix tests
|
2023-03-15 19:36:53 +08:00 |
|
|
|
aeb09539cc
|
Fallback to hostname
|
2023-03-15 13:24:17 +08:00 |
|
|
|
aae6759bcb
|
return published date if the class name is omnivore-published-date which we added when we scraped the article
|
2023-03-13 12:08:01 +08:00 |
|
|
|
1b58804547
|
Add points for any commas (including those in CJK language)
|
2023-02-15 17:12:28 +08:00 |
|
|
|
fc0bbe391a
|
Merge pull request #1805 from omnivore-app/fix/content-parsing
fix/content parsing
|
2023-02-14 14:15:46 +08:00 |
|
|
|
cc8b1cefdb
|
Preserve <pre> elements with prism- class and identify them as code blocks
|
2023-02-14 12:33:59 +08:00 |
|
|
|
9fc77c62d6
|
Merge pull request #1795 from omnivore-app/feat/fallback-urls-for-images
Add the original URL as a fallback when creating URL proxies
|
2023-02-13 16:55:24 +08:00 |
|
|
|
6b4c34bec1
|
Add wechat test page
|
2023-02-10 13:57:21 +08:00 |
|