Yaakov Nemoy: 2009

Yankee

Perhaps it's a bit early, but Toshio and i have been talking about a couple of crazy GSoC projects that will require a bit of a warmup cycle. I'm looking for a couple of highly motivated students who are looking for Summer Jobs and want to do something very crazy and awesome. If you're looking for something to do this summer, leave your email address in the comments, or use my contact information on the Fedora Wiki.

The first project is a content delivery project. Lately we've been bumping into issues with Fedora's No Content policy, where people want to deliver wallpapers, themes, books, documentation, Gimp brushes, and so on that are essential to adding value to the packages we provide in Fedora. A lot of this content falls into a grey area that makes it really hard to decide if it should belong in Fedora. In most cases, however, it really doesn't belong there. Rather than simply turning down content submissions, we'd like to create a repository we can point people to instead.

The aim of this project is to put together a content database system, plus a delivery mechanism and standard, to get content onto people's computers. What the database looks like and what kind of frontend it gets are completely open to the student working on the project. In the coming months, the goal is to sit down and discuss how you want to solve the problem, and how you plan to go about doing that. We'll work out which existing standards already solve part of the problem, who is using them, how to handle licensing questions, how to store data to make it friendly to the systems involved, and what the best way is to locally 'install' it on the computer. A project like this can become very complex, and the biggest challenge is refining it into a 12-week project.

The second project is a bit simpler. We want reliable download statistics from mirrors that carry Linux distributions in general. If you are familiar with Smolt, this will sound very familiar. We're looking for someone to write some tools that mirror managers can use to gather statistics on their machines, and a way for the stats to be collected publicly. The details of the implementation are left to the student, but since we're still looking for the right umbrella organisation and place to host the project, we are also recruiting now.

Both of these projects will most likely lean heavily on Fedora for support, since we plan on making heavy use of both of them once they are production ready. Both projects will be under a different organisation though, and you do not need to be a Fedora contributor to participate. The only requirement i have is a couple of years of experience with Linux.

Yesterday i took the day off from work in order to finish moving to a new flat across town. It's with great sadness that i won't be able to tell people i live in the Nude anymore. If i ever feel like being in the Nude though, it's not far away at all. It's almost literally across the street even.

I can honestly say i came out of the Nude on bike though.

First Nerdavond in Wageningen

Last night we held the first Nerdavond in Wageningen at the Dikke Draak.

Exit Interview for GSoC

A few days ago i sent Satya an exit interview for the human interest angle of things, which she can answer publicly or privately, as she chooses. Since the questions might be useful to someone else, i want to post them here:

What did you learn this summer? Tell me both about technology and community, and about things you learnt because of GSoC and otherwise.
What motivated you to keep going for three months and not just stop after the first week?
What kinds of frustrations and challenges did you come across this summer? It doesn't have to be GSoC related either.
How did you overcome these challenges?
What else did you do during your free time in the summer?
What motivated you to do this at the beginning of the summer? What need did you feel like you were filling?
Is this still important? Or perhaps you have other reasons for keeping up?
Given the chance, would you do this project again?
Given the chance, would you do another GSoC project again?
Did you make any good relationships within the various FOSS or hacker communities?
What are your future plans in life?

Officially, the Ubuntu village was slightly better represented than Fedora, but i think we had a lot of great word of mouth advertising. The people who needed Fedora CDs knew exactly where to come.

You can definitely say we had better uptime and presence at the tent.

Why Setuptools?

I've been having a lot of conversations with Jeroen (kanarip) lately about using setuptools instead of autotools for Python programs. He's not convinced, and to be honest, some of our projects in Fedora won't translate easily from an autotools setup to a setuptools setup. There's always room to improve, though it would be nice if autotools could automatically generate .egg-info metadata and provide a convenient setup.py (or even a pavement.py).

Let me point out though why you might want to consider using setuptools instead of doing it wrong.

Both PEP 8 and PEP 328 recommend using absolute imports as often as possible. The idea is that you specify exactly which module needs to be imported to make the code work. Even if the code is relocated, it still works. It also means that the system can arrange the files any way it wants, so long as the Python interpreter sees the same structure for the modules. This lets distro developers and operating system designers put Python modules wherever they feel necessary, as well as letting tools like virtualenv override the OS defaults on a per user or per application basis.

However, if you use an 'alternative' directory structure in your project, you might find yourself running into trouble. Say you have a module you're building called 'foo'. In your source directory, you have some file /src/foo/some_code.py . There's also some script /run_foo.py at the top level of the source tree. If there is a line in run_foo.py that says 'from foo import some_code', Python will raise an ImportError because the module foo cannot be found: the script's own directory is on the import path, but /src/ is not.
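The failure is easy to reproduce. What follows is a minimal sketch, assuming Python 3.7 or later and a throwaway temporary directory; the file contents and the helper names build_tree and run_script are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile


def build_tree(root):
    """Create run_foo.py and src/foo/some_code.py under root; return the script path."""
    package = os.path.join(root, "src", "foo")
    os.makedirs(package)
    open(os.path.join(package, "__init__.py"), "w").close()
    with open(os.path.join(package, "some_code.py"), "w") as handle:
        handle.write("ANSWER = 42\n")  # placeholder content
    script = os.path.join(root, "run_foo.py")
    with open(script, "w") as handle:
        handle.write("from foo import some_code\nprint(some_code.ANSWER)\n")
    return script


def run_script(script):
    """Run the script in a fresh interpreter; return (exit code, stderr)."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # make sure nothing else puts src/ on the path
    result = subprocess.run(
        [sys.executable, script], capture_output=True, text=True, env=env
    )
    return result.returncode, result.stderr


with tempfile.TemporaryDirectory() as root:
    code, errors = run_script(build_tree(root))
    # The script should exit with an error that mentions the missing module.
    print(code != 0, "No module named" in errors)
```

The script fails even though the package sits only two directories away, because only the directory containing run_foo.py is searched.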

One common, although very wrong and messy, solution is to catch said ImportError. In the corresponding except block, there is a lot of code to work out where the code is being run from. Then it digs into the internals of the module import system in Python so that it pretends it can really see the module foo and all its submodules. This is a rather unfortunate way of writing code, because it really limits what you can do in the source tree. For every script that you have, you may have to copy and paste this code. Furthermore, if you change the layout in the source tree, you may break all your scripts and have to change them manually. Finally, you have to question the sanity of putting file system specific code in a program. There are clear places where such code makes sense, but tinkering with the filesystem and path environment is not one of them.

Luckily, setuptools can do all this work for you. It can translate any on-disk file system layout, such as /src/foo/some_code.py, to foo.some_code for you. It can also direct installation to work properly. It can even modify your Python path environment so you can work directly out of your source tree, and any changes you make will show up automatically. Finally, we can isolate all of this in one place. The entire mapping is just a simple dictionary in one file, which is analogous to a configure.ac file.
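To make the mapping concrete, here is a rough sketch of what it amounts to. It does not call setuptools itself: the PACKAGE_DIR dictionary mirrors the package_dir argument of setup(), and the helpers make_tree and import_from_tree are hypothetical names that only imitate, on a temporary tree, what a development install arranges by putting the package root on the import path.

```python
import importlib
import os
import sys
import tempfile

# In a real project this dictionary would be handed to setuptools in setup.py:
#     setup(name="foo", package_dir={"": "src"}, packages=["foo"])
# The empty key means "the root of all packages lives in src/".
PACKAGE_DIR = {"": "src"}


def make_tree(root):
    """Create src/foo/__init__.py and src/foo/some_code.py under root."""
    package = os.path.join(root, "src", "foo")
    os.makedirs(package)
    open(os.path.join(package, "__init__.py"), "w").close()
    with open(os.path.join(package, "some_code.py"), "w") as handle:
        handle.write("ANSWER = 42\n")  # placeholder content


def import_from_tree(root, module_name, package_dir=PACKAGE_DIR):
    """Import module_name with the mapped package root temporarily on sys.path."""
    source_root = os.path.join(root, package_dir[""])
    sys.path.insert(0, source_root)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(source_root)


with tempfile.TemporaryDirectory() as root:
    make_tree(root)
    some_code = import_from_tree(root, "foo.some_code")
    print(some_code.__name__, some_code.ANSWER)  # foo.some_code 42
```

The point of the design is that the layout knowledge lives in one dictionary; the scripts themselves only ever say 'from foo import some_code'.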

I may post something later about where setuptools fails, and how we might be able to work around it.

Yankee

at 00:44

Fedora == Debian ?

Jeroen just told me that Fedora's Everything spin fits on 33 CDs (for the 64-bit version), which is just about as many CDs as Debian takes up. I guess we're the new Debian.

I want a Fedora Everything on floppy. My work machine, despite having entirely new hardware inside right down to a complete lack of PATA controllers, has a floppy drive. I want to relive the experience of installing Debian via floppy, but this time in Fedora. For old times' sake. I want a pony.

Haskell Bindings to C from Start to Finish

Out of curiosity i wanted to learn how to put together bindings from C to Haskell. The primary need for this on my radar is enabling X composition directly in Haskell, to enable 3D effects in XMonad. Haskell is a great language for doing graphics work, so there is definitely good sense in providing bindings for such things.

However, working with X can be a bit of a large project, as the build systems and workflows for bindings are already relatively complex. I decided to familiarize myself with C2HS, which seems like the future of Haskell bindings, based on the brief bit of research i did. Another important set of bindings we may need in the future is good bindings to the RPM library. The problem here is that there is no good documentation on testing bindings with C2HS from start to finish. What follows is roughly a guide to the results i got, minus the swearing, crashing out in a nervous caffeinated wreck for a few hours in between, and the tasty "Hollandse Nieuw" herring i had when i woke up.

(This year's herring is really tasty. If you get the chance to visit Holland, try one or two of them.)

The following is just bare metal work, in order to test the basic functionality of your bindings. There will hopefully be posts following this one on how to use autotools and cabal in order to build packages. (As soon as i figure it out myself.)

C2HS's workflow can be summed up like this: write code in the C2HS format, which is passed through C2HS as a preprocessor and yields Haskell code. You run this code through ghc with extra command line flags to link the right libraries, and it yields a nice Haskell library that you can link to later. Then just import your code in your program like normal, and you're good to go.

As a sane demo, i am starting with the sample code from the RPM documentation. This can be found here under Listing 16-1

First off, make your project directory. For me this is /rpm/. Since i want the library to be RedHat.Rpm.RpmLib i also need a directory /rpm/RedHat/Rpm/. (I'm debating if i want to keep the RedHat prefix.) Then create a file /rpm/RedHat/Rpm/RpmLib.chs, and immediately begin it with:


{-# LANGUAGE ForeignFunctionInterface #-}
{-# LANGUAGE TypeSynonymInstances #-}

#include <rpmlib.h>

{# context lib="rpmlib" #}

module RedHat.Rpm.RpmLib (
  rpmReadConfigFiles
  ) where

import C2HS
import Foreign.Ptr

rpmReadConfigFiles is the only function we need so far from RPM. The first two lines enable extensions to GHC. The first is for the foreign function interface, the second will be used later. Then comes a standard CPP include statement, followed by the C2HS context hook naming the library. Finally we declare the module and the tokens it exports and import two modules we need from Haskell. Some of these details can be put off into the build system later, but since we are working with the tools directly, we can just put them in our code.

The function we want to bind is defined in /usr/include/rpm/rpmlib.h as such:


/** \ingroup rpmrc
 * Read macro configuration file(s) for a target.
 * @param file  colon separated files to read (NULL uses default)
 * @param target target platform (NULL uses default)
 * @return  0 on success, -1 on error
 */
int rpmReadConfigFiles(const char * file,
  const char * target);

This presents a bit of a unique problem. Normally a C style string presents two possibilities we need to account for. Either it contains characters or it's an empty string. Haskell can handle this equally well with the String type. However, here we have a third possibility: the two in parameters can be null pointers, which is not the same as an empty string. Internally we can handle all three cases in Haskell as a Ptr CChar, which lines up exactly with the C function. However, at the outer levels, we really need to create a function that can accept either a null pointer or a String. In order to handle this, we need a new type class, RString, such that:


class RString t where
    withRString :: t -> (Ptr CChar -> IO a) -> IO a

instance RString String where
    withRString s m = withCString s m

instance RString CString where
    withRString cs m = m cs

One of the gotchas of programming C level code in Haskell is that all C level code, namely anything involving pointers, needs to run inside the IO Monad. There simply isn't a (safe) way to convert something like a String to a Ptr CChar outside of IO. C2HS makes use of a withT function pattern to marshal pure Haskell data to pointers to objects in C.

In our case, if we get a String, whether containing data or empty, we need to marshal it into a CString. If we get a Ptr though, we can assume it's already been marshalled. Chances are, there are cases that can break this, but for our simple example, there's relatively little harm we can do. In any case, withCString first marshals the String into a CString and then does pretty much what the Ptr version of our code does.

With all this squared away, we can define our function pretty much as per the documentation for C2HS.


{#fun unsafe rpmReadConfigFiles
      `(RString s)' =>
      {withRString* `s'      ,
       withRString* `s'    } -> `Int'#}

This creates a new function that cannot call back into Haskell code, accepts anything of RString class, and marshals it with our code. This returns an IO Int. This is it for our binding.

The fun part is compiling it. The first step is to run c2hs on the .chs file. We need to include parameters to the C Preprocessor that tells c2hs where to find the C header files. We also use -l to copy C2HS.hs into the same directory, as it is needed by ghc later. The next step is to run GHC on the resulting .hs file.


/rpm/RedHat/Rpm/ $ c2hs --cppopts='-I/usr/include/rpm/' -l RpmLib.chs
/rpm/RedHat/Rpm/ $ ghc --make RpmLib.hs

Now that this is done, the next step is to build an executable that uses this binding in order to test it. This file, rpm1.hs, goes in /rpm/.


module Main
where

import RedHat.Rpm.RpmLib
import Foreign
import Foreign.C.String

nullP :: CString
nullP = nullPtr

main :: IO ()
main = do status <- rpmReadConfigFiles nullP nullP
          putStrLn (show status)

Since we defined our class only for Ptr CChar (aka CString) and not for the general Ptr a, we use a helper to type cast nullPtr. There's probably a more idiomatic way to do this, but all we need is a kludge. The rest pulls the integer result from the function and prints it on the screen. To compile this, we use the following ghc magic. It includes another call to link in the rpm library.


/rpm/ $ ghc -lrpm --make -debug rpm1.hs
/rpm/ $ ./rpm1
0

That is from start to finish how to write your own C bindings in Haskell. Hopefully i'll figure out how to get this to work via cabal, so i don't need to run so many commands to run tests.

Yankee

at 23:25

Fedora 11 Release Party Wau Report

I've been a bit remiss in reporting on our release party in a timely fashion, but better late than never.

Our plan was to have an outdoor hackfest following the tradition set by eth0:wau last summer. The rules are simple: grill meat, drink beer, and sit around with the computers outside as long as possible. Last year, it yielded this:

Perhaps we did better this year. I'll let you decide.

We started at a typically un-Dutch time, an hour late. We had the fire up and going at around 6, but it wasn't till nearly seven that we started preparing food in the kitchen. People started showing up around then and kept coming as the night ran on until around 11. Our release party ran alongside the birthday parties for two of Alex's housemates. Without that, it might have just been us nerds sitting around in the cold night weather hacking around. As it turns out, some of the guests are also a bit technically oriented, so a couple of USB keys later, we may have a couple of switchovers to Linux.

Seeing as it was a birthday party, and we had a bunch of extra I <3 Fedora shirts, we gave the two guests of honour a very nice Fedora birthday present. Here's Piu, the cat, trying to get in on the action again.

The night ran on and it was the usual share of nerdery, installing Fedora on a few machines and swapping files around via sneakernet. Never underestimate the bandwidth of a truck full of DVDs. In the end i made it to around 3.30, when Alex said i was acting especially loopy, right before i fell asleep. A couple of people made it through all night long, though in the end pretty much everyone got a few hours of sleep at one point or another. As we all woke up at different times, we didn't get around to having a proper breakfast until close to 17:00 the next afternoon. As Alex keeps chickens in his backyard, we were treated to some very fresh eggs.

Building blocks for world domination

I've been throwing around an idea for a few days that might be useful for the Python ecosystem.

It's really easy to praise the bazaar way of doing things. Because of it, i can find four different libraries for manipulating pathnames in Python. There's the built-in os.path which comes with every Python distribution. There's pathtool which provides a nice API for walking a file system, which can replace the function 'walk' that's ordinarily provided by the Python Standard Library. There's Unipath which replaces paths as strings with paths as objects, to avoid a certain class of type based bugs. It also overrides standard behavior with what the author presumes is more logical behavior. There's path.py which provides the same but is 2 years old, and as far as i can tell, dead upstream. There's a fork of path.py in Paver, which provides an interesting feature available to Python 2.5 and up, 'pushd'. This is a context manager that works in a different directory than the surrounding code. (No, i won't translate that for non Python users.) I've also had to do something similar in devshell, for obviously similar reasons.
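For readers who have not met the idea, a context manager that works in a different directory can be sketched in a few lines. This is not Paver's or devshell's code; the name working_directory is hypothetical, and the sketch assumes Python 3 with only the standard library.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Change into path for the duration of the with block, then change back."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Restore the old directory even if the block raised an exception.
        os.chdir(previous)


with tempfile.TemporaryDirectory() as scratch:
    before = os.getcwd()
    with working_directory(scratch):
        inside = os.getcwd()  # somewhere under the scratch directory
    after = os.getcwd()
    print(before == after)  # True
```

The try/finally is the part that hand-rolled versions tend to forget, which is one argument for having a single shared implementation.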

The end result is that there are so many good ideas out there that we really don't know what to do with them. An intermediate Python developer looking to put together a large application doesn't quite know what to do with all these different options. Does he pick path.py, which appropriately follows the os.path behavior? Is Unipath better with its object oriented behavior? Should he use the fork in Paver and require Paver to be installed with the application? Is doing this kind of research even a good use of time that would be better spent just writing the tools in a util module on the side? Without prior knowledge or experience, this can muddy up the ecosystem.

This is of course not unique to manipulating pathnames. I'm finding that i'm looking for a wide range of reusable components in my Python work. It is hard for me to decide if i want to build my own parts or work with existing ones. Browsing the Python Cheese Shop can be a daunting experience sometimes, although great when looking for brilliant ideas.

I think there is a lot of room for a reusable component library. The goal of this library would be to pick the best-of-breed components and collect them all in one repository. As new ideas spring up in other libraries, we could cherry pick them and port them to the upstreams where necessary. It would exist outside other large projects like Zope and Twisted which provide other large building blocks for certain domains of problems.

Potential types of building blocks could include:

Pathname manipulation
Configuration file and runtime options management
A unified set of functional programming inspired functions (I see way too many implementations of them)
Implementations of more advanced functionality than provided by the Python Standard Library
Add your own itch here.

I think the end goal would be to provide a platform of libraries that are blessed by the community at large, but don't necessarily have to be included in a default Python distribution. It'll hopefully let systems programmers and application developers do their jobs with one place to find all the information they need.

Anyone interested?
