* remove arrow image when parsing
* ignore elements with the m_article class name, which indicates a mobile version of the website
* generate test page for City Journal

This uses a negative lookahead to reject nodes that have outstream
ads embedded. Previously they were accepted because they
contained `$article` in the class name.
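
A minimal sketch of the idea, not the actual pattern from this change:
the regex, the `outstream` token, and the `isArticleCandidate` helper
are all hypothetical names chosen for illustration. The lookahead is
anchored at the start of the string, so it rejects the class name if
`outstream` appears anywhere in it, before or after `article`.

```typescript
// Hypothetical pattern: accept class names that contain "article",
// unless "outstream" appears anywhere in the same attribute value.
const ARTICLE_WITHOUT_OUTSTREAM = /^(?!.*outstream).*article/i;

// Hypothetical helper: true when the node's class name looks like
// article content and does not carry an outstream ad marker.
function isArticleCandidate(className: string): boolean {
  return ARTICLE_WITHOUT_OUTSTREAM.test(className);
}
```

With this sketch, `isArticleCandidate("post article-body")` is accepted,
while `isArticleCandidate("article outstream-ad")` is rejected.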

The previous code assumed width and height would always be
numeric, but values like 100% are valid. In cases where we
don't have a numeric value we can just fall back and let the
item be sized by the reader CSS.
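
A rough sketch of that fallback, assuming the attributes arrive as raw
strings. `parseDimension` and `sizeAttributes` are hypothetical helpers,
not names from the codebase, and requiring both values to be numeric is
an assumption made here for simplicity.

```typescript
// Hypothetical helper: returns the numeric value of a width/height
// attribute, or null when it is missing or not a plain non-negative
// integer (for example "100%").
function parseDimension(value: string | null | undefined): number | null {
  if (value == null) return null;
  const trimmed = value.trim();
  return /^\d+$/.test(trimmed) ? Number(trimmed) : null;
}

// Hypothetical helper: emits explicit sizes only when both values are
// numeric; otherwise returns an empty object so the reader CSS can
// size the item.
function sizeAttributes(
  width: string | null,
  height: string | null
): { width?: number; height?: number } {
  const w = parseDimension(width);
  const h = parseDimension(height);
  return w !== null && h !== null ? { width: w, height: h } : {};
}
```

Here `sizeAttributes("640", "360")` keeps both dimensions, while
`sizeAttributes("100%", "360")` yields an empty object and leaves
sizing to CSS.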