Docunext


Website fingerprints

November 15th, 2006
wget -rq --accept=.html http://www.domain.com/ && find * -exec wget -q http://{} -O - \; > combined.www.domain.com.txt && \
rm -rf www.domain.com/ && md5sum combined.www.domain.com.txt | diff - combined.www.domain.com.txt.md5sum

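The above command is a great way to get a public website's "fingerprint", meaning a relatively quick way to sum up the entire site. It mirrors the site's HTML pages, fetches each one again into a single combined file, and compares that file's MD5 sum against a previously stored one. Assuming the stored sum was produced the same way, diff prints nothing when the site is unchanged. I think I will add a database field to hold the fingerprint of each website that I manage. This will allow a process to check the site's integrity and confirm that everything is working. That, along with a link checker to confirm that all other resources load, should go a long way toward ensuring the site's proper display.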
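
Here is a minimal sketch of how that integrity check might look as a pair of shell functions. The names site_fingerprint and check_fingerprint and the stored-checksum file are my own assumptions for illustration. Unlike the one-liner above, this version hashes the mirrored files directly, in sorted order, rather than re-fetching each URL, so the result does not depend on the order find happens to return.

```shell
# Hypothetical helpers: names and file layout are assumptions for illustration.

# Print one MD5 fingerprint for all regular files under a mirrored site directory.
# Files are concatenated in sorted path order so the result is repeatable.
# (Paths containing newlines are not handled.)
site_fingerprint() {
    find "$1" -type f | LC_ALL=C sort | while IFS= read -r f; do
        cat "$f"
    done | md5sum | cut -d ' ' -f 1
}

# Compare the current fingerprint of a directory with a stored one.
# Returns 0 when they match and 1 when the content has changed.
check_fingerprint() {
    current=$(site_fingerprint "$1")
    stored=$(cat "$2")
    if [ "$current" = "$stored" ]; then
        echo "OK $current"
    else
        echo "CHANGED stored=$stored current=$current"
        return 1
    fi
}

# Possible usage (not run here, requires network access):
#   wget -rq --accept=.html http://www.domain.com/
#   site_fingerprint www.domain.com > www.domain.com.md5     # first run: store it
#   check_fingerprint www.domain.com www.domain.com.md5      # later runs: compare
```

The stored value could just as well live in a database field as in a file; the comparison stays the same.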
