Do we really need another packaging system?
Recently I was quoted by Steven J. Vaughan-Nichols questioning the need for the LSB Package API. The kind of conversation going on around the LSB Package API has been a recurring theme ever since I started using Linux, and it contains quite a few fallacies I would like to put to rest. Most importantly, the most common claim I've heard is that this fancy new system could usher in a new era of easy package installation for users. Let's go over some of these myths.

MYTH: PackageKit is just one level of abstraction on top of package management that serves no purpose other than to make things more complex.
REALITY: As stated in said article, Linspire has also created such a layer. There is a huge difference between PackageKit and CNR, though. CNR is really more of a user-facing shopping center for packages that traditionally wrapped around Apt. I haven't looked at it at all in the past year, so I am not fully qualified to comment, but it seems to maintain that overall approach from what I've seen.
PackageKit integrates tightly with the GNOME and KDE desktops using cross-platform tools to make managing a system easier. Namely, PackageKit integrates with ConsoleKit and PolicyKit to let administrators fine-tune users' control over package installation. It is a far better solution than giving users access to sudo.
Neither of these layers is redundant. They provide a level of service well beyond what a basic package manager can provide. Furthermore, they provide it in a clean, cross-desktop, and cross-distribution fashion. The need for these two tools is clear.
MYTH: There are too many package managers, and the landscape is threatening and unhealthy for the future of Linux.
REALITY: Package managers are developed all the time because every user needs something different. To the desktop user, all package management looks the same, but this impression is false. For the enterprise user, package management is a crucial feature. For the enterprise desktop, being able to mix both of these use cases is the optimal solution. Let's examine each one. Here, we're concerned only with dependency resolution, because the rest is mainly a usability layer on top.
The first reality is that not all package managers do dependency resolution the same way, and dependency resolution is the key job of a package manager. For example, there are a number of dependency cases that only a handful of package managers can solve correctly, such as the libBCD use case. The last time I checked, both apt and yum fail to solve it, but zypper has no problem with it. Even if the user doesn't care, bad dependency resolution on the desktop can make life difficult for new users as well as advanced power users. openSUSE had the drive to innovate in package management a little bit, and it has paid off with quality dependency resolution.
Other distributions, such as Gentoo and Crux, take two totally different approaches. Admittedly, neither of them is intended for the average desktop user, but the approach to dependency resolution on each of those distros is entirely its own.
In the enterprise, a package manager is not just a tool for delivering software to a system once, but a tool for building bits of a system out of a variety of components. Deploying server applications, such as Java applications, can require an incredibly complicated and thorny dependency resolution process. It can involve a number of criteria, such as the preferred way of deploying Java packages (there are about 15 different ways to do it), deploying custom kernels based on hardware, intended workload, configuration of the day, or one-shot actions such as updating the BIOS or other firmware. It could include other external matters, such as deploying language files for a certain region because a laptop has been transferred from one department to another as part of an enterprise reorganization. This is overkill for the desktop user, and it is critical that there be more than one package manager in the Linux landscape for this reason alone.
(One of the values of yum is its plugin system. An enterprise administrator can develop all these plugins using the same package manager that normally fits on the desktop.)
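To make that concrete, here is a minimal sketch of what such a plugin can look like. The hook and conduit calls follow yum's plugin API as I remember it, but the file name and the policy itself (holding back kernel packages on machines that run a custom kernel) are invented for illustration:

```python
# /usr/lib/yum-plugins/sitepolicy.py -- illustrative name and path
from yum.plugins import TYPE_CORE

requires_api_version = '2.3'
plugin_type = (TYPE_CORE,)

def exclude_hook(conduit):
    # Hypothetical site policy: machines running a custom,
    # hardware-specific kernel must never pull in stock kernels.
    for pkg in conduit.getPackages():
        if pkg.name.startswith('kernel'):
            conduit.delPackage(pkg)
            conduit.info(2, 'site policy: excluded %s' % pkg)
```

The same mechanism scales up to the more exotic enterprise actions described above, while desktop users never see any of it.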
Having PackageKit sit on top of this complex dependency resolution process is good for the enterprise desktop. Certain 'power users' in a company can be given permission to install extra packages from trusted company sources as needed. Even updates can be handled by the user. To an administrator, this situation is ideal, because he receives two guarantees: 1) the user is getting the right packages, and the system will likely not break; and 2) the user is empowered to do their job and is a satisfied coworker.
Thus, the conclusion so far is that PackageKit and multiple package managers are a clear necessity for the Linux community.
MYTH: There are too many packaging formats, and some are too archaic. Why can't we just settle on one format for writing packages?
REALITY: The LSB Package API uses XML. For that reason alone, there will be at least two packaging formats. Not all of us like XML, nor is XML the ideal format for delivering package information to every system. For the reasons I stated above, different formats are needed for all the different use cases. A Crux Pkgfile or a Gentoo ebuild is a packaging format fine-tuned to the system it runs on. Forcing those distros to use some other method to package their programs wouldn't make any sense at all.
In the article, the author writes:
Hell, if you really think scripting is the best way to install
software in 2008, why not go back to installing everything by
compiling all software from source code? That works even better
and it will work on any Linux distribution. It’s tedious and
requires every Linux user to also be at least enough of a
developer to be able to deal with gcc, make, and library
compatibility issues, but hey it would work.
This seems to imply that there is something old-fashioned about using shell scripts. This couldn't be further from the truth. In creating nearly all packages, there is always some amount of shell scripting that needs to be done. Many packages require some form of post-install and pre-uninstall scripting to be done. The most universal method for doing this on any distribution is a script. The best part, though, is that it works entirely behind the scenes. I never have to interact with RPM running any script; the process is completely hidden from the end user. Putting a layer in C on top of this only makes the life of a package maintainer harder.
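To make the 'behind the scenes' point concrete, here is a minimal sketch (not RPM's actual code; the function and package names are mine) of how an installer can run a package's embedded post-install script through /bin/sh, surfacing output only when something fails:

```python
import subprocess
import sys

def run_scriptlet(pkg, stage, body):
    # Run an embedded scriptlet (e.g. a %post script) via /bin/sh.
    # The end user never interacts with this step; output appears
    # only if the scriptlet fails.
    result = subprocess.run(['/bin/sh', '-s'], input=body,
                            capture_output=True, text=True)
    if result.returncode != 0:
        sys.stderr.write('%s scriptlet of %s failed:\n%s\n'
                         % (stage, pkg, result.stderr))
    return result.returncode

# A typical post-install action, expressed as ordinary shell:
run_scriptlet('libfoo', '%post', 'ldconfig\n')
```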
MYTH: We should have some system of universal packages. The user should be able to get a package from just about anywhere, and it should run automatically on their Linux system.
REALITY: I am not going to go into the MSI installer debate that has been argued to death. Let's address some of the real issues that face the Linux desktop.
Every distribution uses slightly different compiler settings. There are a million and one reasons for doing so, but the effect is that no package, neither RPM nor DEB nor anything else, is guaranteed to run on every Linux system. Having some universal package API will not solve that problem; it will only draw users away from the real target they should be focusing on.
Third-party software that hasn't been developed with your distro in mind presents a serious problem for everyone. Chances are, it will break your system. Having a common API will not make the problem easier, because there are always a number of other layers to worry about. For example, if a program doesn't ship with an SELinux policy, it will break for all confined users. So using something like the openSUSE Build Service to get the latest Amarok for Fedora would be pointless. Your distro may also use an alternate file layout, as GoboLinux does; RPMs would have a hard time working on such a system, let alone some random API.
The reality of the situation is that the differences between distros give people a chance to innovate more with how a Linux system works. Trying to funnel everything down through some common layer at the bottom really reduces the chances for big innovation to happen. Steven comments that layers like PackageKit and CNR are band-aids on top of a failed process. That is not how I would describe them. They are tools that make a confusing landscape easier for the end user, while letting distributions maintain their competitive advantage. I highly doubt that such a common API will be effective, let alone easily accepted by the community.
But please, prove me wrong.
16 flames:
Very well written.
What is the libBCD use case that yum can't handle?
the libBCD case is where program A depends on B, C, D, which depend on libB, libC, libD respectively. There is also a library libBCD that provides the same functionality. You can find out more about it here: http://duncan.mac-vicar.com/blog/archives/310
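For the curious, here is a toy model of the case, based on my loose reading of that post (the capability names and the resolver code are invented; no real package manager works exactly like this). A solver that picks a provider one capability at a time installs three libraries and never notices that libBCD alone would do:

```python
from itertools import combinations

# A needs B, C, and D; each of those needs one capability below.
provides = {
    'libB': {'capB'}, 'libC': {'capC'}, 'libD': {'capD'},
    'libBCD': {'capB', 'capC', 'capD'},
}
needed = {'capB', 'capC', 'capD'}  # what B, C, and D require in total

def greedy_providers(caps):
    # One capability at a time, preferring the most specific
    # provider (a common heuristic): installs three libraries.
    chosen = set()
    for cap in sorted(caps):
        if not any(cap in provides[p] for p in chosen):
            candidates = [p for p in provides if cap in provides[p]]
            chosen.add(min(candidates, key=lambda p: len(provides[p])))
    return chosen

def minimal_providers(caps):
    # Consider providers jointly: the smallest covering set wins.
    for size in range(1, len(provides) + 1):
        for combo in combinations(sorted(provides), size):
            if caps <= set().union(*(provides[p] for p in combo)):
                return set(combo)

print(greedy_providers(needed))   # {'libB', 'libC', 'libD'}
print(minimal_providers(needed))  # {'libBCD'}
```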
I posted a response to SJV's original blog, but here is a direct response to your comments. As a longtime user of FOSS software and developer of a FOSS project, I believe that there are several issues that must be handled by a package manager:
- Installation: The hardest part of this is dependency resolution. However, many people assume that shared libraries are all this means. In reality, there are version dependencies for interpreters such as Perl, PHP, Python, or Java (including any special libraries not in the base), various middleware packages, databases, and possibly even drivers.
- Startup: If the application includes a daemon, an init script has to be installed if it is to be started at system boot, and this should be done according to the rules of the distro's system admin programs. But even user applications have to be installed in directories in the users' PATH, they have to be able to find configuration and runtime data, and they must have permission to write logs and program data.
- Application documentation: This is too often overlooked completely. Man pages, READMEs, info files, and HTML docs must be installed in the standard location for the distro. There is also a minimum requirement for documentation (don't you just hate not having a man page?!?).
- Maintenance: This includes security and bug fix updates, and one of the best features of Linux distros is the ease of handling this for all applications.
- Availability of programs: While the established distros have huge inventories of applications, none of them have packages for all applications that run on Linux, including both FOSS and proprietary apps. This is the main point of those who evangelize for universal installers, altho' being able to install newer versions than the distro supplies is related.
I am an advocate for using the distro package managers because they do the best job of meeting all of the above requirements. However, the issue of access to applications not supported by a distro is something that needs to be solved (and as a developer I don't believe that the LSB is up to the task--see my post at SJV's blog).
I believe that the solution is a utility for developers to easily create packages for multiple distros (such as the ESP Package Manager at http://epmhome.org/index.php) and repositories for developer-maintained packages. This needs to be supported by the distros, to be sure that the utility creates valid packages. This would allow project teams or vendors to include their applications in the ecosystem of the distro package managers.
Later . . . Jim
The Realeyes IDS, check it out at:
http://realeyes.sourceforge.net
I feel strongly about freedom and diversity in the production of GNU/Linux distributions. I also hurt, like so many users do, from being stuck with a certain set of applications at a certain version, for a certain distribution at a certain version. It's the suckiest thing about being a GNU/Linux user by far... in my view.
As a software developer of 15+ years, I've solved many problems and do not see this one as unsolvable. First, if a distribution were to focus on being a good platform for software generally, rather than being the platform and application vendor of everything--life could be better.
The platform would focus on having hardware detection, standards (such as LSB and FHS), authentication, networking, and disk management--and support for this. They could focus on the platform as a core competence and let application vendors focus on their respective applications as core competencies. Then, when you get OpenOffice.org (for example), you'd get it with perhaps video tutorials, an online support network, etc.
And then a software packaging system could focus on its core competency of producing a universal installer. The installer should deal with the dependency issues and things like SELinux policies. There can be more than one, but tying to a specific version of a specific distro is just masochistic... albeit typical.
Personally, I'd begin with Autopackage and put a spec file alongside each lib that provides the make options it was compiled with. That would, actually, be a nice standard to have for all distros.
One critical desktop usability point is being missed here, and in general by Linux developers: users (the typical statistical representatives of the desktop market as a whole) do not (and should not) care about packages. As long as there remains a strong fixation on packages, Linux collectively will never go beyond a 3% share of the desktop market.
To illustrate it - is Firefox a package? is Skype a package? is VirtualBox a package? is Eclipse a package?
And if you are starting to argue that they actually are each individually a package...please pause and think again.
One thing I've noticed: the quality of a packaging system is entirely dependent on the human element - how well the repository is maintained, kept consistent, all parts working well together, etc.
apt isn't vastly technically superior to yum or YaST. apt has a good reputation because Debian is superlatively well maintained - all that bureaucracy really does pay off. Ubuntu is good because they ride on Debian's superlative quality.
Fink on MacOS X uses apt and it's distinctly mediocre, because the repository isn't very well maintained. yum on Fedora is much better because they maintain a high-quality repository.
It's not the technology - it's how well the repository is maintained by the distro.
Curiously I'm about to ditch OpenSuSE 11 due to utterly horrible package management.
When adding a program, in this case KDE's Kate, creates enough dependency hell that going ahead leaves a totally fsked system, then either the package manager is faulty (there is more than enough evidence of that, because all the conflicts could have been solved if YaST actually checked what is in its repositories) or the release isn't fully baked.
Package management and YaST have always been SuSE's weakest point, and it's sad to see that they still are.
ttfn
John
Vi, I have stopped and thought again, then thought some more, and I still can't for the life of me figure out what the point you're trying to make is.
Do you think you could be kind enough to tell us, instead of insisting that we guess?
There's a proposal in the Linux Standards Base for a standard universal installer for Linux.
That's where he started off, and the question he asked is: is there a need for another one?
I'm not sure he ever answered that.
SUSE Build comes close in some ways, though. Having come back to read it again and again, as you have, I think he's confusing package management with packaging systems.
(Of course he could be talking about rpm or deb or the smaller ones out there, I'm just not sure.)
In the end the post goes from packaging systems to package management and then off somewhere else. The discussion seems to be taking the same tack and I contributed to the tacking. (Me bad!)
The question is valid, of course, but I have no idea where he's trying to go with it or what his answer is.
ttfn
John
I agree that the packaging mess is not an unsolvable problem. If you believe it is, you're being a defeatist, since anything is solvable.
It is always possible to add some kind of layer of common communication to a system in order to make it modular/standardized. There are all sorts of web browsers, but they all use (for the most part) standardized HTML/CSS/etc. Web creators are still free to create content for the internet using whatever programs they want. My point is that there can always be *some* point, either hidden or visible to packagers, developers, and end users, at which you can decide to create a standard of communication. This badly needs to be done. Linux needs to be made modular and free in every way, so that users, developers, and everyone else can become free, instead of being tied down to software stack XYZ just because they chose distro W. IMO, if things were modular enough, all a distro would be, and should be, is a certain collection of packages and default configurations.
For example, take the video container format MKV. This is not restrictive in any way, because you can throw DivX, Xvid, Ogg, MP4, and other formats into this container. It acts as a compatibility "wrapper" to some degree. In the same sense, what if you could make an LNX, LINUX, LX, or whatever packaging container format, and inside it be able to put RPMs, DEBs, etc.? But, getting away from package formats, all package managers should be compatible with a standard, and should be able to handle standard packages. I'd like to see the package manager be as unimportant as my preference between Opera and Firefox. I pick one over the others because of its speed and efficiency, not because it's compatible with a certain group of programs.
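To sketch the idea (the .lnx layout, member names, and tool mapping are all made up, and a real container would need signing, metadata, and more), the wrapper would just hand the right native payload to the local package manager:

```python
import os
import subprocess
import tarfile
import tempfile

# Hypothetical 'LNX' container: a tarball carrying one native payload
# per supported format, e.g. payload.rpm and payload.deb side by side.
NATIVE_TOOL = {'rpm': ['rpm', '-U'], 'deb': ['dpkg', '-i']}

def install_lnx(path, fmt):
    member = 'payload.' + fmt  # invented naming convention
    with tarfile.open(path) as tar, tempfile.TemporaryDirectory() as tmp:
        tar.extract(member, path=tmp)
        subprocess.run(NATIVE_TOOL[fmt] + [os.path.join(tmp, member)],
                       check=True)
```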
When this gets solved, it will be a major boon to Linux, and its adoption rate will soar much faster once it becomes easy to store, trade, share, copy, transfer, and install Linux software packages regardless of anyone's trivial preference in distro and distro version.
I also feel I should note that it may not be in the best interests of the companies behind distros to see programs become accessible to all Linux users. They have a motive for keeping things as they are, so that they can show off their own repository of programs as a perk of choosing their distro. If Linux is to be successful, though, they must work together on this instead of against one another, so I hope that, for now at least, they still care about solving this problem.
1) You wrote:
In creating nearly all packages, there is always some amount of shell scripting that needs to be done.
Yes, this is historically what has been done. However, the more I think about it, the more problematic I find it. Why are there things during installation that have to be done by shell scripting? What are these things?
Installing a package means copying its files to the right places (binaries, docs, images, launchers, etc.), creating/updating config files, and restarting services. All these things could be done by the installer, and the pre-inst and post-inst scripts IMO hide the deficiencies of the package manager and/or package format.
For Java there is the "ant" tool, and this way the tested, proven, and versatile (but old) makefiles are not needed for compilation. And ant's build.xml file is, well, XML. Even more, ant is extensible with new "actions", so it is not limited to a single set of XML tags. This gives me confidence that it would be doable to use XML for package management, so please don't spread misinformation and prejudices here.
Even more:
Many packages require some form of post-install and pre-uninstall scripting to be done. The most universal method for doing this on any distribution is a script.
Yes, you're right. However, if the package manager ran such a script while providing it with settings like paths to config files, a description of package dependencies, and the other things that differ between distributions today, this would be an even more powerful mechanism. And at this point I would agree that there is a place for shell scripting (which I like and use quite often in my everyday work).
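A minimal sketch of what I mean, with every variable name invented: the package manager exports distro-specific facts into the scriptlet's environment, so one script can work everywhere:

```python
import os
import subprocess

def run_portable_scriptlet(script_path, pkg_name):
    # The distro fills these in; the scriptlet only reads them.
    env = dict(os.environ,
               PKG_NAME=pkg_name,
               PKG_SYSCONFDIR='/etc',        # the distro's choice
               PKG_INIT_STYLE='sysvinit',    # or 'upstart', ...
               PKG_DOCDIR='/usr/share/doc/' + pkg_name)
    subprocess.run(['/bin/sh', script_path], env=env, check=True)
```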
2) You mentioned a myth:
The user should be able to get a package from just about anywhere, and it should run automatically on their Linux system.
I fail to see this as a myth. Either we want uniformity of the Linux environment or we don't. Either expanding Linux's penetration of the market is a good thing or it's bad. If these goals are to be met, this is one of the means to achieve them.
Why do distributors and distributions exist? Because people have different views on how to do things: different gcc and toolchain versions, different compiler settings, different locations of config files (think: apache2), different package formats. If there were some common ground in the form of an installer API, no need for pre-inst and post-inst scripts, and to some extent uniformity of the OS environment, all the above differences would become less important.
Maybe it's time to really think about more cross-distribution standardization?
3) However:
Trying to funnel everything down through some common layer at the bottom really reduces the chances for big innovation to happen.
And here I agree with you 100%. So in fact I'm not completely sure what to do w.r.t. software packaging in Linux. But a uniform approach to this problem, and a uniform solution, would surely bring the commercial world to Linux: vendors would be able to package their software once and have it run on almost every distro. One question remains to be answered: is this really a goal that we would like to reach?
Seems to me that package management systems are trying to solve a problem that should not even exist in the first place. The developers have created a mess that the package managers are attempting to clear up. Much better to do the job properly, get rid of the mess, and then the problem you are trying to solve no longer exists.
Current hard drives have such a huge capacity that there is no problem installing the required libraries locally with each application. So what if there are 10 copies of the same libraries on the hard disk? I'd rather waste a few gigs out of several hundred than have applications that fail due to library conflicts. So far as I can see, the need to share these resources dates back to the time when hard disk space was scarce and expensive. It isn't anymore, so let's move on. Do what PC-BSD does and simply package most of the libraries with the apps. This will solve most of the issues. The others can be solved by different means. Package managers are not the real answer.
I would love to see a system where I can download a file and just double-click it to install it, like you can on Windows, or the Mac for that matter. I think that more people would be using Linux if they could do this.
Over the weekend I've gotten a lot of comments. Let me address them.
Vi: You are against this fixation on packages? That fixation is a very real problem for Windows and OS X as well, and I'm not sure where your argument is going. Perhaps computers would be easier to use if the user never had to see packages at all, but that's outside this scope.
Packages are a necessity for the enterprise.
David Gerard: Right on. A friend of mine is switching 170 computers over to Ubuntu now, because Fedora failed at packaging Bacula correctly.
Jim Sansing: A universal package tool will still fail, because every platform is radically different. It's simply not doable under the status quo.
Matthew: How are distros not already good platforms for software? Furthermore, name me one distro that has benefited from having more than just one or two standard repos for getting packages.
RHEL actually does provide a good platform for software in general because it is a) rock solid stable and b) certified for a number of platforms.
John: That's not a problem with YaST, but with how the packages declare dependencies on each other. It's something I am critical of in both Fedora and openSUSE, but I doubt it will change any time soon.
YFRWLF: Let me poke a couple of holes in your argument. We already have two competing standards for 'package containers', namely RPM and DEB. But woe betide the person who tries to install a Fedora RPM on Mandriva, or an Ubuntu DEB on Mepis. Having a universal container is pointless when the contents are different, and the receiver of the contents, the distro, is going to differ across distros. We won't see this utopia of packages that work cross-distro while each distro remains fundamentally different. Getting there would require far more stabilization.
siryes: 1) You propose replacing shell scripting with another language. It's probably the most interesting proposal I've seen, but it has a few challenges. Some packages will always be exceptions in how they do things. The package maintainer needs some easy way to plug an action into the running system, or the maintainer will feel completely restricted.
I'm not sure what is faulty about shell scripting. It's just a language, one out of many. At best, you propose replacing it with another language that can be more easily audited, and perhaps more generically phrased. Such a thing could also be done at the shell-script level, through a series of cross-platform tools. There are more solutions than just using XML-ish code similar to ant's.
Also, this doesn't take into account some very silly cross-distro issues. What if a package needs to install a boot-up script and assumes it will run only in runlevel 5? What if the distro does not have a runlevel system at all? How do you draw a comparison? Such an API could feasibly take 10 years to develop, and meanwhile the Linux desktop will have changed three times over.
2 and 3) You really did answer why it is hard to provide some kind of universal packaging. One version of GCC might have been security audited, while another contains state-of-the-art optimizations. That means Red Hat and Fedora alone would ship two different GCC toolchains.
On the other hand, autoconf is a pretty universal tool, and it can be used to compile against both toolchains. In some ways, we are as standardized as we are ever going to get.
Anonymous: You are a chicken. Who are you?
Furthermore, shipping libraries with packages would bloat an OS immensely. So much so that running Linux in any embedded scenario would be a joke; the OS itself would take up more space than Vista.
Moreover, we would still need some system for getting the software on and off the machine, and the package manager could be used to keep track of which files were installed and which services were configured. Embedding libraries does not solve this problem at all.
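A minimal sketch of that last point (the database path and helper names are invented): even with bundled libraries, something has to remember which files belong to which package so removal stays clean:

```python
import json
import os

MANIFEST = '/var/lib/toypkg/manifest.json'  # invented location

def _load():
    if os.path.exists(MANIFEST):
        with open(MANIFEST) as f:
            return json.load(f)
    return {}

def _save(db):
    with open(MANIFEST, 'w') as f:
        json.dump(db, f)

def record_install(pkg, files):
    db = _load()
    db[pkg] = files               # remember exactly what we put down
    _save(db)

def remove_package(pkg):
    db = _load()
    for path in db.pop(pkg, []):  # delete only what we installed
        if os.path.exists(path):
            os.remove(path)
    _save(db)
```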
John: This is something that could probably be done through PackageKit. All that really needs to be done is to create a pretty front end, perhaps even some web-based container format for storing packages for multiple distros. openSUSE has been looking into this with its one-click installer. I have a number of ideas for how you could achieve such an effect with the interface. Let me know if you want to hear them.