Awk in Python: Incrementing the Release tag

Yesterday, i commented on how you could get Awk like behavior with Python. The more FP oriented reader might notice that they are pretty much Arrows. At this point, you may wonder why i'm doing this in Python if Haskell would be better suited to doing Arrows. Lately, i'm wondering the same thing, and i think it's about time i start programming more in Haskell. I think it would have to be a project i don't actually expect anyone to use.

Today, as a demonstration, i decided to start with something recognizable. Currently, in the Fedora Developer toolchain, we have a number of scripts and tools that do various tasks. Unfortunately, these tools are written in a style very close to shell scripts, which makes it hard to script actions on top of them. One goal of Devshell (among many) is to have a library of common tasks that can later be scripted or built upon. Rather than being forced to work at the shell script layer, any code that can access Python functions and objects can run these commands.

The first victim to undergo the scalpel is rpmdev-bumpspec. The following result is not 100% compatible with the original script, but it is part of a small library that will be able to modify a spec file in more than one way. Currently, any tag can be edited, the release tag incremented, and changelog entries inserted. It also lets you query any spec file for any RPM tag via Python, and it handles calling out to the command line in the background. It will even zip the output up into a list of dictionaries for your result. It also divides cleanly between the definition of a spec file and how to modify it.
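To give a rough idea of what incrementing the release tag as a stream edit can look like, here is a minimal sketch. It is not the Devshell code and not rpmdev-bumpspec itself; the name bump_release is hypothetical, and the sketch assumes the Release value starts with a plain integer (anything after it, such as %{?dist}, is carried over untouched).

```python
import re

# Assumption: the Release value begins with an integer, e.g. "Release: 3%{?dist}".
# DOTALL lets the last group keep a trailing newline if the line has one.
RELEASE_RE = re.compile(r"^(Release:\s*)(\d+)(.*)$", re.IGNORECASE | re.DOTALL)


def bump_release(lines):
    """Yield spec file lines, adding one to the leading integer of the Release tag."""
    for line in lines:
        match = RELEASE_RE.match(line)
        if match:
            prefix, number, rest = match.groups()
            yield f"{prefix}{int(number) + 1}{rest}"
        else:
            # Every other line passes through untouched, so ordering is preserved.
            yield line


spec = ["Name: foo\n", "Release: 3%{?dist}\n", "Summary: An example\n"]
bumped = list(bump_release(spec))
# bumped[1] is now "Release: 4%{?dist}\n"
```

Because the function takes an iterable of lines and yields lines, it can be fed an open file object directly, and nothing but the Release line is rewritten.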

Take me to the code!

Recreating Awk like behavior in Python

In Fedora Devshell, one feature i would like to support is turning a revision control branch of Fedora-specific changes into a set of Patch lines, and converting a set of Patch lines into a revision control branch. I figured there would be two ways of doing this: the first is parsing the entire spec file and recreating it later, the second is stream editing in the manner of the well known Unix tools such as ed, sed, and awk.

The problem i had there is that while full out parsing is useful in certain conditions, you risk having the result of the parser not being exactly the same as the input. For example, if a developer stuck a Patch: line in a spec file after the BuildRequires, but the rest were before the BuildRequires, a poorly written parser might reorder the file, unnecessarily. This is the kind of headache that would make writing a full out spec file parser a several day long project that would be prone to error and annoying to test. I decided to go with the latter method.

Awk is a really cool program if you ever learn to use it effectively. The problem though is that scripting code from Python to Awk would get unwieldy. It would require creating one more layer over a DSL (domain specific language), which is just as troublesome as writing a full parser. Still, i would have liked to program in a set of semantics for making all kinds of changes to a spec file from the bottom up. Instead, i've recreated some of the core features from Awk in Python, as a sort of complex parser.

This pythonic awk parser lets you create all kinds of patterns and composites of patterns. It also lets you create various handlers or composites of handlers for various patterns. Finally, it lets you compose a series of patterns into a single awk "program" represented by an instance of the Awk object. Since the process method accepts an iterable and yields a generator, multiple awk programs can be chained together. This only took a few hours to do.
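The shape of such a design can be sketched in a few lines. This is only an illustration under assumptions, not the code in the repository: the class layout, the add method, and the helper names regex, any_of, drop, and upper are all hypothetical. Only the idea of an Awk object whose process method takes an iterable and yields a generator comes from the description above.

```python
import re


class Awk:
    """Hypothetical miniature awk "program": an ordered list of (pattern, handler) rules."""

    def __init__(self):
        self.rules = []

    def add(self, pattern, handler):
        # pattern: callable(line) -> bool
        # handler: callable(line) -> iterable of output lines
        self.rules.append((pattern, handler))
        return self

    def process(self, lines):
        # Accepts any iterable of lines and yields output lazily,
        # so one program's output can be another program's input.
        for line in lines:
            for pattern, handler in self.rules:
                if pattern(line):
                    yield from handler(line)
                    break
            else:
                yield line  # no rule matched: pass the line through unchanged


def regex(expr):
    """Pattern that matches when the regular expression is found in the line."""
    compiled = re.compile(expr)
    return lambda line: compiled.search(line) is not None


def any_of(*patterns):
    """Composite pattern that matches when any of the given patterns match."""
    return lambda line: any(p(line) for p in patterns)


def drop(line):
    """Handler that removes the line from the output."""
    return []


def upper(line):
    """Handler that emits the line in upper case."""
    return [line.upper()]


strip_comments = Awk().add(regex(r"^#"), drop)
shout_patches = Awk().add(regex(r"^Patch\d*:"), upper)

spec = ["# a comment", "Name: foo", "Patch0: fix.patch"]
result = list(shout_patches.process(strip_comments.process(spec)))
# result == ["Name: foo", "PATCH0: FIX.PATCH"]
```

Chaining works because process never needs the whole file at once; each program simply consumes the lines the previous one yields.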

Click here to see in the fedora-devshell.git repository.

Fedora Test Day - Nouveau - Experience

Today i participated for the first time in a Fedora Test Day. Conveniently i had a few hours free today, so i decided to devote some time to making sure my nVidia chip will work well in the next Fedora release. The short answer, it does :)

The long answer, it does, but only after i had to jump through many hoops to find this out. Being the slightly environmentally conscious person i am, i decided to attempt to spare the life of a tree or two, and pull up a USB stick for testing. I would also gain the highly beneficial advantage of being able to store bookmarks on the USB key. This way, after rebooting, i could quickly call up the next test i would need to run. As a side point, i think it might be beneficial in the future, if images generated for Test Days would have a bookmark directly to the tests needed to be run. I think it might save a bit of time for people who are testing it on their only machine.

According to the instructions on the wiki, i installed an updated version of syslinux from rawhide in order to copy the image to USB. Following that, i ran the livecd-iso-to-disc tool which ran successfully. However, when i went to reboot the machine, the USB image could not load initrd, and the process failed. Apparently, i was not the only person who had that problem. Then i tried taking a few different steps.

First i tried to see if i could install livecd-tools from rawhide. Not only did it try to pull in the kitchen sink, but it was also missing dependencies. Yum could not even complete the operation, were i to let it.

Second i tried to recompile the f11 SRPM for f10 via mock. While it completed successfully, and the resulting livecd-iso-to-disc program could complete successfully, i still could not get the LiveUSB media to boot up.

Third, i tried setting up a rawhide VM. Since i used boot.iso which is essentially a netboot install, it took the better part of an hour to install. Also, after stripping out office utilities and other nonsense, i was told that there was not enough room to install Fedora in the default 4GB of space virt-manager uses. I had to go back and tweak the install, and for a KVM instance, it was really slow as molasses. It certainly didn't give me the warm fuzzies inside, let alone visions of a pony prancing in a field. I'm assuming that the space needed included not only the installed OS plus working space to do so, but also the space needed to download the half gigabyte of packages. I'm wondering if this counts as a bona fide bug.

Finally, when the VM was ready, i installed the livecd-tools package and copied over the iso to the VM. Then i ran the tool, went and made dinner, went off to stammtisch and came back 4 hours later, and it was still running. All the virtualization in the world is not gonna help you when you're IO bound. At this point i broke down and burnt myself a damn CD. Somewhere out there, a family of trees is crying for their daddy or mommy. Here's where there is some pretty epic fail. I'm sure it's not a bug that the tools in Fedora 10 are not adequate for setting up a LiveUSB for Fedora 11. I'm sure there are always aspects to a kernel that the previous OS can never anticipate. Looking at the sheer number of people who participated in today's test day, i may be the only one here complaining, but for the love of $deity and all that is holy, is it so hard to make it possible to put Fedora 11 on a USB stick via Fedora 10?

Well after that, the tests went really well. I gave everything a good working through and it's always a pleasure to see Fedora getting better. Given the turnout, i think that the Test Day is so much of a success that the QA team is gonna be far too busy to even look at my ideas here.

It would be nice to put a sticker here saying "I participated in a Fedora Test Day", just like we do with elections.

Cooperation for peace?

Sometimes i feel that many Americans just don't understand peace. This is a poor generalization, because many people are intelligent, well educated or well traveled, and get a fair look at the way many different cultures can live together. But sometimes it seems that the supposed Joe Public doesn't get it, out of living in the monoculture that is the US. I want to highlight something that might not get picked up by the mainstream media otherwise, even though it's a bit off topic.

Edinburgh Scotland Muslim Leaders Offer to Guard Synagogue

As contributors to Open Source, we all know how to contribute our resources to all sorts of projects regardless of who is involved. Sometimes it's hard to see this willingness to get along outside of our daily Open Source work. I wonder if the solution we need in the Middle East and western India, as well as in parts of Western Europe, isn't just western leaders getting together in smoke filled rooms, but the lessons we know from working with Open Source.

(As a side note: "Vos iz neias" is a colloquial yiddish phrase that means "What's up". The link points to an online resource of news relevant to primarily Orthodox Jews.)

A Random Idea for Mirroring

Just a random idea to come across my brain.

Problem: Frequent and even automatic live and installable ISOs are known to be good for testing and even just making sure nothing breaks. They should at minimum be part of any tinderbox style testing, and preferably mirrored for easy global access. However, ISOs are not rsyncable and require massive amounts of bandwidth to be sent over frequently to the many mirrors.

Preexisting bits: We have all the requisite packages already being sent to all the complete mirrors on a regular basis. We also have jigdo, which can compose media from a mirror or a local data store.

Solution: Create either a push command from the mirror manager, or a pull command available to the mirrors to let them know good media can be composed from the current existing up to date bits. After a mirror updates itself, it can then use an automated script to generate media using the preexisting media compose tools we have and jigdo if needed. The md5sums can be used to verify that the media composed locally is the same as the media composed by our tinderbox testing system.
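The verification step at the end could be as small as the following sketch. The function names md5sum and media_matches are hypothetical, and it assumes the mirror already has the expected checksum from the tinderbox system as a hex string; how that checksum gets delivered is left open.

```python
import hashlib


def md5sum(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks so a large ISO fits in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def media_matches(path, expected_md5):
    """True if the locally composed media has the checksum published upstream."""
    return md5sum(path) == expected_md5.strip().lower()
```

A mirror would call media_matches on the ISO it just composed and only publish the file when the result is True.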

Pie In The Sky: In a local infrastructure where one mirror box pulls from the central mirror, and then the local boxes mirror the local source, the local mirrors can be configured to just pull an ISO from a local source. Likewise, a compose box could be set up to do the composing separate from the mirror, which could also make the ISOs available.

Well, it's just an idea. If only there were another 8 hours to the day.

Output from the python EKG

People have asked, so here's what it looks like so far.

I can't believe it

I came in on schedule. In rewriting EKG, i made a few estimates and handed them off to Max as a set of deliverables. Unfortunately, reality got in the way and the man hour schedule got pushed off course a bit because of real world hours, namely taking care of school work and other things. I also had to stick in a bunch of Haskell work, as we've finally gotten some amazing momentum going there.

On the way, i decided to get fancy too. I realized that we would need to have a framework in EKG that could give us room to grow in looking at the things we analyze. I took some of the lessons i learnt doing the framework for Devshell and applied them roughly to EKG. Because of this, i ran into some technical issues handling multiple inheritance in Python. Let's just say that multiple inheritance, while it has its uses, can end up really bad. Several reworkings later, i was a full week of real time off schedule.

This design decision has paid off immensely. As a result, it took me 4 man-hours tonight to put together the basics of the reporting element. It's not quite as sophisticated as the version 2 grapher, nor does it do kevin bacon graphs yet, but it at least has the number one asked for feature, deltas. People, and by people, i mean the persons that pay me, want to see a numerical difference per month, per quarter, and per year, and all this is per domain.

I also took it one step further. You can even get the deltas of a particular month or quarter over multiple years. It's actually just a general purpose algorithm that can do any delta query on any collection of counts.
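As an illustration of how one general purpose routine can answer all of these queries, here is a minimal sketch. It is not the EKG code; the function name deltas and the choice of (year, month) tuples as keys are assumptions made for the example.

```python
def deltas(counts, keys):
    """Return (previous_key, key, difference) for each consecutive pair of keys.

    counts maps any hashable key to a number; keys is the ordered selection
    to compare. Keys missing from counts are treated as zero.
    """
    keys = list(keys)
    return [
        (prev, cur, counts.get(cur, 0) - counts.get(prev, 0))
        for prev, cur in zip(keys, keys[1:])
    ]


# Hypothetical message counts for one domain, keyed by (year, month).
counts = {(2008, 1): 10, (2008, 2): 14, (2009, 1): 25}

month_over_month = deltas(counts, [(2008, 1), (2008, 2)])
january_over_years = deltas(counts, [(2008, 1), (2009, 1)])
# month_over_month == [((2008, 1), (2008, 2), 4)]
# january_over_years == [((2008, 1), (2009, 1), 15)]
```

The routine knows nothing about months, quarters, or domains. The caller decides which keys to line up, which is what makes the same function serve a month over month query and a same month across years query.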

When I made the original estimates, i estimated around 4-8 hours for recreating the grapher tool. According to my timer, i've spent 4 man-hours on it tonight. It's closer to 6 or 8 because of constant fidgeting and moving around; i've been very ADD tonight for some reason.

Next up for EKG
Currently i'm spitting out HTML with string fudging. There are some nice libraries for generating HTML, i need to package one i like for Fedora and include it. I could also use templates, but i'm sick of doing things the MVC way, and i want to experiment more with Object-Proxy development.

Michael DeHaan's found a pretty cool javascript library that has some pretty widgets for tracking changes, and doing pretty pie graphs. The problem is that it will need some nice TLC to make it usable, and then it needs to be packaged as well. Things should get interesting.

I should also do a couple of blog posts about some of the cool tricks i put into EKG to make it work. Maybe we should think of a Fedora related Python code snippet database too.

An Anecdote about why doing the wrong thing is bad.

Actually, this should be obvious to some, but, apparently, people love to do the wrong thing.

Taken from: ext4, application expectations and power management by Matthew Garrett


[1] xfs behaves like ext4 in this respect, so the obvious argument is that all our applications have been broken for years and so why are you complaining now. To which the obvious response is "Approximately anyone who ever used xfs expected their data to vanish if their machine crashed so nobody used it by default and seriously who gives a shit". xfs is a wonderful filesystem for all sorts of things, but it's lousy for desktop use for precisely this reason.


Once i was using XFS as my desktop file system. Anyone who knows me knows i like to do things differently just because i can. That's just how i do things.

One day, my system decides that it wants to crash, probably something to do with JBoss, but i don't remember exactly. (Not to offend JBoss developers, i was using a version that wasn't meant to work.) When i bring my computer back up, Firefox complains that there's something wrong and i can't start it again. I essentially lost my entire Firefox profile.

Let's be honest here, XFS is probably a bad idea to be used for a desktop. Let's just pretend someone's running an LTSP server on top of XFS. Or rather, someone's running that on ext4. Let's just forget about the LTSP part then, an average user. I'm sure a lot of Firefox profiles have gone down screaming in flames. I'm sure someone has used this as an excuse to blame Firefox for being shitty, and then gone off blaming Open Source for being so bad.

The real problem isn't exactly a mistake that one programmer made. A huge percentage of code is copy and paste from other projects. It's really easy for mistakes to persist over a large number of projects. I really can't blame Firefox for being any more at fault than XFS, Fedora, Open Source, or Buddha. This was just a moment where i went "Aha! This explains everything". Well mostly everything.

GSoC 2009 as a Feature

Following my previous blog post, i've jotted down a few more notes about how to do GSoC projects as Fedora features. It's really just a rough draft and could probably do with a few examples and links and other things. You can find it here:


This is really relevant to both potential Students and Mentors. I'm hoping that for at least some of the projects, people will be willing to experiment using this workflow to make things happen.

Google Summer of Code is on.

It's official, there will be a GSoC this year. Once again i am looking to mentor someone this summer. Last summer, it was a complete failure. I feel as if there was not enough passion about the project to motivate my student.

This summer i would like to do something very different. Currently, Google is taking applications from mentoring organizations, which means that it is not even time for students to get started putting together their applications. I think it is fairly likely though that Fedora will have a few spots available at least. Therefore if you're looking to get a head start, this is just the blog post you want to be reading.

Given the success of the Feature List in Fedora, i want to model a project around that. I want to find someone who is passionate about an idea, and is willing to communicate it to the world. What i would like to see from an interested student in the next few weeks is a very open ended project proposal. It can cover any aspect of Open Source, but it has to be written as a Feature proposal. The goal is to produce a Fedora feature. You can do it as part of other concurrent development in Fedora or as a completely new concept. What's important is that it has to be innovative enough to be worth putting in the Release Notes.

I'll probably write more about this idea in the next few days. But if you have that itch you're looking to scratch in the Fedora and Linux world, you can get started right away.

Last but not least, if you put together a Feature proposal and submit it as a GSoC project, but do not make the cut, remember, you can still work on the project. As a student, contributing an entire "Feature" is an amazing resume item.

Life is a pony farm

I'm a packager

Bug 426751 - (ghc-X11) Review Request: ghc-X11 - A Haskell binding to the X11 graphics library.

Reported: 2007-12-25 15:15 EDT by Yaakov Nemoy
Modified: 2009-03-03 06:26 EDT

I'm in your Fedora messing with your CVS.

It's been a fun adventure :D