Commit Graph

12 Commits

Author SHA1 Message Date
12af64609c Fix readability issues with null style elements
isProbablyVisible can fail in this case because style can be
undefined on an element.
2022-03-23 13:35:00 -07:00
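A minimal sketch of the kind of guard this commit describes, assuming the visibility check reads `style.display` and a `hidden` attribute. The `StyledNode` type and the `isProbablyVisibleSafe` name are hypothetical and are not taken from the repository.

```typescript
// Hypothetical structural type: only the fields this sketch reads.
interface StyledNode {
  style?: { display?: string } | null;
  hasAttribute(name: string): boolean;
}

// Assumed shape of the check; the real function may test more conditions.
function isProbablyVisibleSafe(node: StyledNode): boolean {
  // Optional chaining keeps a missing or null style from throwing.
  const display = node.style?.display;
  return display !== "none" && !node.hasAttribute("hidden");
}
```

With this guard, a node that has no `style` property is treated as visible rather than raising a TypeError.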
960a22d50c Fix/city journal parsing (#266)
* remove arrow image when parsing

* ignore m_article classname element which indicates a mobile version of the website

* generate test page for city journal
2022-03-21 22:53:21 +08:00
0361ef86fa Better handling of HTML entities in descriptions
The HTML code method didn't implement all possible
entities, causing some (usually rquote) to display.
2022-03-14 11:02:08 -07:00
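One way to decode entities more completely is to handle numeric references generically and look up named ones in a table. The sketch below is an illustration only: `decodeEntities` and the short `NAMED_ENTITIES` table are hypothetical, and the commit's actual approach is not shown in this log.

```typescript
// Illustrative subset of named entities; a real table would be much larger.
const NAMED_ENTITIES = new Map<string, string>([
  ["amp", "&"], ["lt", "<"], ["gt", ">"], ["quot", '"'], ["apos", "'"],
  ["nbsp", "\u00a0"], ["lsquo", "\u2018"], ["rsquo", "\u2019"],
  ["ldquo", "\u201c"], ["rdquo", "\u201d"],
]);

function decodeEntities(text: string): string {
  const pattern = /&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z][a-zA-Z0-9]*);/g;
  return text.replace(pattern, (match: string, body: string) => {
    if (body.charAt(0) === "#") {
      // Numeric reference: hexadecimal (&#x2019;) or decimal (&#8217;).
      const isHex = body.charAt(1).toLowerCase() === "x";
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range code points untouched instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Unknown named entities are left as written.
    return NAMED_ENTITIES.get(body) ?? match;
  });
}
```

Unknown names pass through unchanged, so a gap in the table shows up as visible entity text in the description rather than as dropped characters.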
fc7d972855 Fix typo in readability date handling causing this parse issue
Can remove our special handler for the published date now that we
are pulling it out correctly.
2022-03-14 10:20:19 -07:00
8a2bb0f49d Handle blogger sites that display the full feed on the article page
2022-03-10 13:48:39 -08:00
234dba9174 Improve readability of channelnewsasia
This uses negative lookahead to reject nodes that have outstream
ads embedded. Previously they were being accepted because they
contained `$article` in the class name.
2022-03-04 10:34:44 -08:00
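The negative-lookahead idea can be sketched as follows. The pattern and the `outstream` and `article` class-name fragments are assumptions for illustration; the expression actually committed is not shown in this log.

```typescript
// Hypothetical pattern: accept class names containing "article",
// unless "outstream" appears anywhere in the same string.
const ARTICLE_CLASS = /^(?!.*outstream).*article/i;

function looksLikeArticleNode(className: string): boolean {
  // No "g" flag, so test() keeps no state between calls.
  return ARTICLE_CLASS.test(className);
}
```

The lookahead is anchored at the start of the string, so an ad marker rejects the node whether it comes before or after the `article` fragment.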
fc9aa9452c Add a flag in readability to retain table elements in newsletter emails (#152)
* add a flag in readability to retain table elements in newsletter emails

* remove header of axios newsletters
2022-03-01 11:49:38 +08:00
bcb90dac49 Make sure we strictly interpret img width/height values
2022-02-18 18:31:06 -08:00
603e9683db Better naming for this parsing function
2022-02-18 15:34:38 -08:00
8756a52a54 Call correct parsing method
2022-02-18 14:53:24 -08:00
612822f5a3 Handle non-number size attributes in images
The previous code assumed we would always have number size values
for width and height, but attributes like 100% are valid. In
cases where we don't have a numeric value we can just fall back
and let the item be sized by the reader CSS.
2022-02-18 14:18:36 -08:00
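A minimal sketch of strict size parsing with a fallback, assuming that only plain non-negative numbers should be treated as pixel sizes. The `parseStrictSize` name and the exact acceptance rule are hypothetical, not the repository's implementation.

```typescript
// Returns a number only for plain numeric strings such as "640" or "12.5".
// Values like "100%", "auto", or "12px" yield undefined, so the caller can
// skip the explicit size and let the reader CSS size the image.
function parseStrictSize(value: string | null | undefined): number | undefined {
  if (value == null) return undefined;
  const trimmed = value.trim();
  return /^\d+(\.\d+)?$/.test(trimmed) ? Number(trimmed) : undefined;
}
```

A regular expression is used here instead of `parseInt`, because `parseInt("100%", 10)` returns 100 and would silently turn a percentage into a pixel width.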
84f32935f5 Open source omnivore
2022-02-11 09:24:33 -08:00