
8 Commits

Author SHA1 Message Date
6d405432af add site_name and site_icon to page model and return in resolver (#341)
* add site_name and site_icon to page model and return in resolver

* fix tests
2022-03-30 10:43:10 +08:00
12af64609c Fix readability issues with null style elements
isProbablyVisible can fail in this case because style could be
undefined on an element.
2022-03-23 13:35:00 -07:00
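A minimal sketch of the kind of guard this commit describes, assuming a visibility check that reads an inline `display` value. The `NodeLike` interface and `isProbablyVisibleSketch` are hypothetical names for illustration, not the project's actual code.

```typescript
// Hypothetical minimal shape of a DOM-like node; not the project's real type.
interface NodeLike {
  style?: { display?: string } | null;
  hasAttribute(name: string): boolean;
}

function isProbablyVisibleSketch(node: NodeLike): boolean {
  // Optional chaining yields undefined when style is null or undefined,
  // so a missing style is treated as "not hidden by inline style"
  // instead of throwing a TypeError.
  const display = node.style?.display;
  return display !== "none" && !node.hasAttribute("hidden");
}
```

With this guard, an element that has no `style` object at all falls through to the `hidden` attribute check rather than failing.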
960a22d50c Fix/city journal parsing (#266)
* remove arrow image when parsing

* ignore m_article classname element which indicates a mobile version of the website

* generate test page for city journal
2022-03-21 22:53:21 +08:00
0361ef86fa Better handling of HTML entities in descriptions
The HTML code method didn't implement all possible
entities, causing some (usually rquote) to display.
2022-03-14 11:02:08 -07:00
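The failure mode described here can be sketched as follows. This is an illustration only: `decodeEntities` and the `NAMED` table are hypothetical, and the commit does not say which decoder the project switched to. Numeric references can be decoded generically, but a hand-written name table passes through any name it lacks, which is how an undecoded entity ends up in a description.

```typescript
// Deliberately small, hypothetical table; the full HTML entity list is far larger.
const NAMED = new Map<string, string>([
  ["amp", "&"],
  ["lt", "<"],
  ["gt", ">"],
  ["quot", '"'],
  ["rsquo", "\u2019"],
  ["rdquo", "\u201D"],
]);

function decodeEntities(text: string): string {
  return text.replace(/&(#x[0-9a-f]+|#\d+|[a-z][a-z0-9]*);/gi, (match: string, body: string) => {
    if (body[0] === "#") {
      // Numeric character reference, decimal (&#8217;) or hexadecimal (&#x2019;).
      const isHex = body[1] === "x" || body[1] === "X";
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Named reference: names missing from the table are left untouched.
    return NAMED.get(body) ?? match;
  });
}
```

In this sketch `decodeEntities("it&rsquo;s")` decodes the quote, while a name absent from the table, such as `&hellip;`, comes back unchanged.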
fc7d972855 Fix typo in readability date handling causing this parse issue
Can remove our special handler for the published date now that we
are pulling it out correctly.
2022-03-14 10:20:19 -07:00
8a2bb0f49d Handle blogger sites that display the full feed on the article page
2022-03-10 13:48:39 -08:00
234dba9174 Improve readability of channelnewsasia
This uses negative lookahead to reject nodes that have outstream
ads embedded. Previously they were being accepted because they
contained `$article` in the class name.
2022-03-04 10:34:44 -08:00
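The negative-lookahead idea in this commit can be sketched as below. The pattern and the class names `article` and `outstream` are assumptions chosen for illustration, not the project's actual expression.

```typescript
// Hypothetical pattern: accept class strings containing "article" unless
// "outstream" appears anywhere in the same string.
const ARTICLE_CLASS = /^(?!.*outstream).*article/i;

function isArticleCandidate(className: string): boolean {
  return ARTICLE_CLASS.test(className);
}
```

The lookahead `(?!.*outstream)` is anchored at the start of the string, so the rejection applies whether the ad marker comes before or after the word that would otherwise match.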
84f32935f5 Open source omnivore
2022-02-11 09:24:33 -08:00