6d405432af
add site_name and site_icon to page model and return in resolver (#341)
...
* add site_name and site_icon to page model and return in resolver
* fix tests
2022-03-30 10:43:10 +08:00
12af64609c
Fix readability issues with null style elements
...
isProbablyVisible can fail in this case because style can be
undefined on an element.
2022-03-23 13:35:00 -07:00
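A minimal sketch of the kind of guard 12af64609c describes, assuming the visibility check reads `style.display` and a `hidden` attribute. The `StyledNode` interface and the function name `isVisibleSafe` are hypothetical stand-ins so the example runs without a DOM; the actual patch may differ.

```typescript
// Hypothetical minimal shape of a DOM-like node; real elements carry far more.
interface StyledNode {
  style?: { display?: string } | null
  hasAttribute(name: string): boolean
}

// Treat a missing or null `style` as "not hidden by style" instead of throwing.
function isVisibleSafe(node: StyledNode): boolean {
  const hiddenByStyle = node.style?.display === 'none'
  return !hiddenByStyle && !node.hasAttribute('hidden')
}

// Small helper for building test nodes (illustrative only).
function makeNode(
  style: StyledNode['style'],
  attrs: string[] = []
): StyledNode {
  return { style, hasAttribute: (name) => attrs.includes(name) }
}
```

Optional chaining keeps the check on one line and covers both `undefined` and `null` styles.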
960a22d50c
Fix/city journal parsing (#266)
...
* remove arrow image when parsing
* ignore m_article classname element which indicates a mobile version of the website
* generate test page for city journal
2022-03-21 22:53:21 +08:00
0361ef86fa
Better handling of HTML entities in descriptions
...
The HTML code method didn't implement all possible
entities, causing some (usually rquote) to display undecoded.
2022-03-14 11:02:08 -07:00
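To illustrate the problem 0361ef86fa addresses, here is a rough sketch of an entity decoder that handles numeric references generically and leaves unknown named entities untouched. The function name `decodeEntities` and the small `NAMED` table are assumptions for illustration, not the project's code; a complete decoder would need the full HTML named-entity list.

```typescript
// Illustrative subset of named entities; the HTML spec defines many more.
const NAMED = new Map<string, string>([
  ['amp', '&'], ['lt', '<'], ['gt', '>'], ['quot', '"'], ['apos', "'"],
  ['lsquo', '\u2018'], ['rsquo', '\u2019'],
  ['ldquo', '\u201C'], ['rdquo', '\u201D'], ['nbsp', '\u00A0'],
])

function decodeEntities(text: string): string {
  return text.replace(
    /&(#x[0-9a-f]+|#[0-9]+|[a-z][a-z0-9]*);/gi,
    (match: string, body: string): string => {
      if (body[0] === '#') {
        // Numeric reference: decimal (&#39;) or hexadecimal (&#x27;).
        const isHex = body[1] === 'x' || body[1] === 'X'
        const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10)
        // Leave out-of-range code points as written rather than throwing.
        return code <= 0x10ffff ? String.fromCodePoint(code) : match
      }
      // Named reference: fall back to the original text when unknown.
      return NAMED.get(body) ?? match
    }
  )
}
```

With a partial table like this one, any entity missing from `NAMED` still shows up verbatim in the output, which is the failure mode the commit message describes.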
fc7d972855
Fix typo in readability date handling causing this parse issue
...
Can remove our special handler for the published date now that we
are pulling it out correctly.
2022-03-14 10:20:19 -07:00
8a2bb0f49d
Handle blogger sites that display the full feed on the article page
2022-03-10 13:48:39 -08:00
234dba9174
Improve readability of channelnewsasia
...
This uses negative lookahead to reject nodes that have outstream
ads embedded. Previously they were being accepted because they
contained `$article` in the class name.
2022-03-04 10:34:44 -08:00
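The negative-lookahead approach mentioned in 234dba9174 can be sketched as follows. The pattern, the `outstream` marker string, and the function name `looksLikeArticleNode` are hypothetical; the real selector used for channelnewsasia is not shown in the commit message.

```typescript
// Hypothetical pattern: accept class names that mention "article",
// unless "outstream" appears anywhere in the same string.
// The (?!.*outstream) lookahead is checked at the start of the string and
// makes the whole match fail if the marker is present.
const ARTICLE_CLASS = /^(?!.*outstream).*article/i

function looksLikeArticleNode(className: string): boolean {
  return ARTICLE_CLASS.test(className)
}
```

Anchoring the lookahead at `^` means the ad marker is rejected wherever it appears in the class list, before or after the `article` token.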
84f32935f5
Open source omnivore
2022-02-11 09:24:33 -08:00