Docunext


Hpricot is an HTML Parser that uses Ragel and is Written in Ruby

March 8th, 2009

After reading up on Ragel, I came across Hpricot, "a fast, flexible HTML parser written in C". I read a little bit more, and it sounds like Hpricot is a mix of jQuery selectors, XPath, and DOM dot paths. Sounds very cool!
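To make the "mix of selectors" idea concrete, here is a minimal sketch of querying the same document with a CSS-style selector and with an XPath-style expression. It assumes the hpricot gem is installed; the sample markup and the helper names (`links_css`, `links_xpath`, `intro_text`) are made up for illustration and are not from the Hpricot docs.

```ruby
# Sketch only: assumes the hpricot gem is available (gem install hpricot).
begin
  require 'rubygems'
  require 'hpricot'
rescue LoadError
  warn "hpricot gem not found; install it to run this sketch"
end

# Hypothetical sample page used only for this example.
SAMPLE_HTML = <<-HTML
<html>
  <body>
    <div id="nav"><a href="/home">Home</a> <a href="/wiki">Wiki</a></div>
    <p class="intro">Hello</p>
  </body>
</html>
HTML

# Collect every link target using a CSS-style (jQuery-like) selector.
def links_css(html)
  doc = Hpricot(html)
  doc.search("a").map { |a| a['href'] }
end

# Collect the same link targets using an XPath-style expression.
def links_xpath(html)
  doc = Hpricot(html)
  doc.search("//a").map { |a| a['href'] }
end

# Fetch the text of the first paragraph with class "intro".
def intro_text(html)
  doc = Hpricot(html)
  doc.at("p.intro").inner_text
end

if defined?(Hpricot)
  puts links_css(SAMPLE_HTML).inspect
  puts links_xpath(SAMPLE_HTML).inspect
  puts intro_text(SAMPLE_HTML)
end
```

Both `search` calls go through the same method, so switching between the two query styles is just a matter of changing the string.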

But just how fast is it? And can it be separated from Ruby? Should it be? It looks promising: the Ragel code appears to be completely separated from the Ruby code, which makes it easier for me to examine.

I'm confused by the fact that there is Java code in the repository. Hmmm.

More notes at the Hpricot Docunext wiki page.
