Helpful Tip About Pulling From Upstream With Git

I found this post very helpful:

easily fetching upstream changes @ gitready.com

In a nutshell (for my own reference):

git remote add upstream git://github.com/user/repo.git

Then, in .gitconfig:

[alias]
  pu = !"git fetch origin -v; git fetch upstream -v; git merge upstream/master"

Git is really impressing me these days. Unfortunately for Subversion, it's showing me things I didn't even know about the ol' reliable svn...

After trying this on docu-not-git-wiki (a new branch to inspect minad's fork), I'm getting this error:

fatal: You have not concluded your merge. (MERGE_HEAD exists)

The message means an earlier merge stopped partway and was never committed. I'm not sure exactly how I fixed it, but I fiddled around with git reset HEAD git-wiki.rb and manually edited files, and then did:

git commit -m "ok" -a

It seems to be fine, as I pushed the new branch and it's now available on GitHub.

And lastly, to push tags to GitHub:

git push --tags

UPDATE: I'm not going to use that shortcut again anytime soon! It got me started with the idea, but obviously I don't understand the process well enough yet. My second attempt at creating a branch and merging upstream changes with it resulted in the entire history getting applied to both the branch and my mainline branch. Ugh! Thankfully docu-not-git-wiki is fairly simple. :-)

Duh, yeah. I just looked at the alias again and it does indeed merge upstream/master into whatever branch happens to be checked out. Doh!
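For next time, here is a rough sketch of a more cautious version of that shortcut. It is only my guess at a guard, not something from the gitready post: the function name merge_upstream and its single argument are made up for illustration, and it assumes the remote is called upstream, as above. It refuses to merge unless the branch I name is the one actually checked out.

```shell
# Hypothetical helper: merge upstream/master only when the expected
# branch is the one currently checked out.
# Usage: merge_upstream <expected-branch>
merge_upstream() {
    expected="$1"
    # symbolic-ref prints the checked-out branch name; it fails on a
    # detached HEAD or outside a repository, so fall back to empty.
    current=$(git symbolic-ref --short HEAD 2>/dev/null) || current=""
    if [ -z "$expected" ] || [ "$current" != "$expected" ]; then
        echo "refusing to merge: on '${current:-unknown}', expected '${expected}'" >&2
        return 1
    fi
    # && stops the chain if the fetch fails, unlike the ; in the alias.
    git fetch upstream -v && git merge upstream/master
}
```

Calling it as merge_upstream docu-not-git-wiki while sitting on master would print the refusal and do nothing.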

Extrapolating Automation is Beautiful Empowerment

I've been using Rake a lot lately, and today I had a nice "epiphany" about automation: it's not just about setting up a bunch of tasks and writing shortcuts to them.

Automation can get really sophisticated with tools like Rake. This isn't a new concept to me, but its practical applicability is.

It's truly the beautiful empowerment that comes along with programming.

Beautiful Empowerment

Beautiful Empowerment... sounds cool. But what do I mean by that? As it sounds, it's a bit of smoke and mirrors, but mostly the real deal. It's the capability to leverage evolving, adaptive logic. Doesn't that sound wonderful?

What about the practical application? To be specific, I'll reference Martin Fowler's page on Rake tasks (written almost five years ago!), specifically the part about synthesizing rake tasks.

I'm glad he used the word synthesizing instead of generating or compiling because the tasks are very much real, but only manifested upon instantiation. It was hard to wrap my head around the idea until the use of file modification times came into play. With their involvement, I began to see how substantially different tasks could be defined from a single "master" task. Almost the way an abstract class can become wildly different objects, depending upon their instantiation.

And I find it interesting to note the similarity here to lambda functions (aka anonymous functions): functions that can be passed as arguments to other functions. They too provide wildly flexible abstraction of concrete logic, while thankfully keeping the complexity of the domain space within reason.
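To make that a little more concrete before the real examples arrive, here is a minimal sketch of the idea in plain shell rather than Rake. This is not Rake's API; the function names build_if_stale and build_all and the .txt to .out rule are all invented for illustration. One generic rule is turned into a separate "task" per file, each one decided by file modification times, and the command to run is handed in as arguments much like a lambda would be.

```shell
# Run a command only when the target is missing or older than its source.
# Usage: build_if_stale <source> <target> <command...>
build_if_stale() {
    if [ ! -e "$2" ] || [ "$1" -nt "$2" ]; then
        shift 2
        "$@"    # the passed-in command is expected to (re)create the target
    fi
}

# Synthesize one "task" per source file from a single rule.
# Hypothetical rule: every .txt file in a directory yields a matching .out copy.
# Usage: build_all <directory>
build_all() {
    for src in "$1"/*.txt; do
        [ -e "$src" ] || continue    # the glob matched nothing
        build_if_stale "$src" "${src%.txt}.out" cp "$src" "${src%.txt}.out"
    done
}
```

Running build_all on a directory twice in a row only does work the first time; touching one source file afterwards causes just that one target to be rebuilt.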

Specific examples to come! Feedback and additional thoughts are welcome and appreciated!

Extension for Files Formatted with Markdown

Markdown is a simplified text formatting syntax. There are many like it, but it appears to be the one gaining the most traction. The "original" syntax is almost too simple, though, and there are a bunch of extensions, too.

As I've come to utilize markdown and pay more attention to others who do as well, I've taken note that some people are adding filename extensions to indicate that the file is formatted with markdown.

So far, I've seen:

  • .mdwn
  • .markdown
  • .md
  • .mdown

I like .md the best because it is short, but it might be misleading as it doesn't explicitly yell "This file is formatted with markdown!".

Conversely, I dislike .markdown the most because it is so long.

Personally, I've been using the ".txt" extension because at heart, markdown files are truly .txt files. Or maybe because back in the dial-up days I used to belong to a bunch of BBSes which hosted tons of cool "txt files"!

I think I'll probably start trying out .md.txt. I like the way extensions can be chained together, like .tar.gz or .tar.bz2, and then merged into a shorter one, like .tgz and .tbz.

What do you think is the best filename extension for markdown formatted files? Feedback welcomed and appreciated!

Let's Use Flat Files For Storage

I ran out of time while writing the blog post titled "What's Up CouchDB, Ledger, XML, BNF, Ragel, and Git?", so I had to end it before I got to one of the more important points: using flat files as storage.

Flat files are obviously not the best choice for real-time data management, but in many cases they might work really well.

Ragel makes it relatively easy to write super fast parsers, so why not store blogs, project plans, finances, forums, and even something like emails in a git repository? Thinking about this makes me wonder if git is involved with Google Wave?

I find GMail's threading behavior annoying, but I also find the duplication of quoted text in emails annoying. Why not use git to pass a single document back and forth as part of a conversation? It would be tracked with each version retained, and thankfully not needlessly duplicated.

No doubt there are some very nice features and functions in modern RDBMS, but it's a simple task to convert a structured document, like an XML document, into a SQL-powered database. That's what I'm considering doing with Regdel.

The canonical store will be in a ledger-cli flat file, and save for simple data entry, all data manipulation would be performed via the Regdel web interface. The flat file format makes it lightweight and perfect for version tracking with something like git. XML would work too, but it's a lot heavier than the ledger format, and since ledger-cli exists, I don't see a reason for storing the data as XML.

If similar parsers existed to convert flat file data to XML structured data, I would be inclined to store even more information in git, and use an RDBMS to manage it. I don't know of any, but I'm pretty sure that there are some BIND-format DNS record parsers which can export to XML. If so, that would be another good example of something to store as a flat file in git.

Other Uses for Flat File Data Storage:

  • Firewall rules
  • ACLs
  • Postfix maildirs and / or aliases
  • pfSense / m0n0wall configurations (already in XML)

I've started saving my OpenOffice.org spreadsheets as flat-XML files. If and when I change them, the change can be more efficiently stored within git. I haven't tested it, but I imagine that after a few committed revisions, the history of an oocalc flat XML file would take up less space than the same number of revisions of a compressed oocalc file, simply because git would have to store a near-complete copy of each compressed file instead of the differences between them. Git compresses the old data itself, too.
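A quick way to get a feel for this, short of testing real spreadsheets, is to compare two revisions of a stand-in file both flat and compressed. The sketch below is only that: the 2000-line numbered file is a made-up substitute for a spreadsheet, gzip stands in for the spreadsheet's own compression, and the function names are mine. It also says nothing exact about git's storage, since git only stores differences once objects are packed; it just shows how much of a file changes on disk after a one-line edit.

```shell
# Count the byte positions at which two files differ.
count_changed_bytes() {
    # cmp -l prints one line per differing byte; wc -l counts them.
    cmp -l "$1" "$2" 2>/dev/null | wc -l | tr -d ' '
}

# Build two revisions of a stand-in "spreadsheet" (2000 numbered lines,
# one line edited), then compare them flat and gzip-compressed.
compare_flat_and_compressed() {
    dir=$(mktemp -d) || return 1
    awk 'BEGIN { for (i = 1; i <= 2000; i++) print i }' > "$dir/rev1.txt"
    sed 's/^1000$/9999/' "$dir/rev1.txt" > "$dir/rev2.txt"
    # -n leaves name and timestamp out of the gzip header, so only the
    # compressed data itself can differ.
    gzip -nc "$dir/rev1.txt" > "$dir/rev1.gz"
    gzip -nc "$dir/rev2.txt" > "$dir/rev2.gz"
    echo "flat: $(count_changed_bytes "$dir/rev1.txt" "$dir/rev2.txt")"
    echo "compressed: $(count_changed_bytes "$dir/rev1.gz" "$dir/rev2.gz")"
    rm -rf "$dir"
}
```

The flat pair differs in only the four edited bytes, while the compressed pair differs in many more places, which is the property that makes flat files friendlier to difference-based storage.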

Of course XML is not so much fun for humans to read and write, so it's nice to think that for consistent data, something like Ragel can bridge the gap between consistently formatted documents and fully structured documents.

Anyone else find this an interesting road to follow?

Regex to Remove non-ASCII Characters

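I had a file that I wanted to remove non-ASCII characters from. I did some searching, got some hints, and came up with this:

LC_ALL=C sed -r 's/[^\x20-\x7E\t]//g' filename.txt > newfilename.txt

The bracket expression keeps printable ASCII (space through tilde) plus tabs and deletes everything else. The | separators I originally had inside the brackets were just literal characters, and sed never sees the newline within a line, so neither is needed. LC_ALL=C makes sed work byte by byte.

Worked for me on Debian GNU/Linux Squeeze.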
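For comparison, here is a sketch of the same idea with tr instead of sed. It is an alternative I'm noting for reference, not what I originally ran, and the function name strip_non_ascii is just a label I picked. It keeps tab, newline, and printable ASCII, and deletes every other byte.

```shell
# Delete every byte that is not tab (\11), newline (\12),
# or printable ASCII (\40 through \176, i.e. space through tilde).
# Reads stdin, writes stdout.
strip_non_ascii() {
    # -c complements the set, -d deletes; LC_ALL=C forces byte-wise handling.
    LC_ALL=C tr -cd '\11\12\40-\176'
}
```

Usage would look like: strip_non_ascii < filename.txt > newfilename.txt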

What's Up CouchDB, Ledger, XML, BNF, Ragel, and Git?

Regular Docunext readers will likely know how much I like boring old technologies.

Stuff like static files, 3Com 3c509 network cards, IDE drives, and so on. Why? Because old technologies that have stuck around long enough to get boring probably serve some useful purpose and are reliable as well.

No doubt I love trying out new technologies too, but since I'm appreciative of tested, tried, and true technologies, I really love seeing old technologies getting combined in new ways.

I'm getting inspired to try out some new techniques with old technologies, prompted by Ledger, XML, BNF, Ragel, and Git. Git isn't really an "old" technology, but revision control as a purpose has been around long enough. Also, in its short life so far, Git has gathered an impressive following of users (including myself) who are testing it and trying it out, and many (again, including myself) are giving it positive reviews.

Let me try and get more specific by highlighting some characteristics of the aforementioned technologies:

  • Git quickly manages static file revisions and changes and only takes up a tiny amount of storage space to do so.
  • Ledger uses a static file as its data source to run reports on financial data. The file format is quite simple and it uses a built-in parser to convert the data into a more machine-centric format.
  • XML is a reliable, robust, extensible and structured file format for which there are a ton of parsers making it easier than ever to transfer, transform, and manipulate data and documents. However, it can easily become burdensome for human beings to create or update. It can also require significantly more resources than some other structured formats like JSON or YAML.
  • BNF is a format for describing formats. It can be used to define tokens for use in building custom parsers. For example, Ledger includes a BNF reference that describes how it parses plain text data files.
  • Ragel is a state machine compiler. Since file parsers are state machines, Ragel can be used to build custom parsers. Ragel can't directly consume BNF files, but software exists to bridge the gap.
  • CouchDB is a document-oriented storage system, sort of similar to an RDBMS, but in my opinion, more like a dynamic filesystem.

In this day and age of complex RDBMS and proprietary "binary blob" file formats, this mix of plain files and parsers seems a little old-fashioned.
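To show what "file parsers are state machines" means in practice, here is a tiny hand-rolled sketch using awk from the shell. It is a stand-in for what a Ragel-generated parser would do, not Ragel itself, and the input is a heavily simplified, assumed ledger-like layout (a dated header line, then indented "account amount" lines, then a blank line), not Ledger's real grammar. The function name sum_postings is made up.

```shell
# Two-state parser for a simplified, assumed ledger-like format.
# Reads stdin, prints one "account total" line per account.
sum_postings() {
    awk '
        # A line starting with a digit is a dated header: enter "postings".
        /^[0-9]/    { state = "postings"; next }
        # A blank line ends the transaction: back to "header".
        /^[ \t]*$/  { state = "header"; next }
        # Inside a transaction, indented lines are "account amount" postings.
        state == "postings" && /^[ \t]+/ { total[$1] += $NF }
        END { for (a in total) printf "%s %.2f\n", a, total[a] }
    ' | LC_ALL=C sort
}
```

Feeding it a couple of transactions on stdin prints each account once with its summed amount. Postings with the amount left off, which the real Ledger format allows, are not handled here.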

So what are the new techniques that I've referred to? Patience. I've just run out of time for this article, so the deets will have to wait for the next one. In the meantime, don't forget about boring tech!
