
Ideas for future package management

1. Attribute handling & package meta data
1.1. Attributes
1.1.1. Attribute hooks
1.2. Meta data
1.3. Implementation notes
2. Common functionality
2.1. From pkgutils
2.2. From ports
2.3. From prt-get
2.4. List of things a library could/should provide
2.5. pkgdb abstraction layer
3. Common configuration
4. Security enhancements
5. Implementation specific items
5.1. symlink handling
5.2. Pkgfile format
5.2.1. Dependencies
6. INSTALL/REJECT feature for pkgadd.conf
7. Post install hooks
8. Forks & rewrites
8.1. Han's fork
8.2. Han's rewrite
9. To rewrite or not
10. Summary
11. Personal goals
11.1. Johannes Winkelmann
11.1.1. Notes
11.1.2. Further documents

There's nothing fundamentally broken with current pkgutils; however, quite a few ideas have come up over the last couple of years. This page tries to provide a high-level summary of them.

Note that the features listed here are not necessarily going to go into pkgutils, so don't be afraid. The developers are determined to keep the original CRUX goal of simplicity by design; however, the requirements have changed quite a bit over the years, so some changes are strongly wanted.

1. Attribute handling & package meta data

1.1. Attributes

Currently, the package database stores the package name, the version, and the files which belong to a package. There are several use cases where more information would be useful, although not everyone will need it. The idea here (suggested by Per and discussed at CRUXCon 2004) was to introduce a generic mechanism called "attributes": these are integrated into a binary package and can then be moved into the package database at will. This makes it possible to satisfy the requirements of the following use cases, which I consider relevant for CRUX:

use case	attributes
plain source based CRUX	name, version, release, files
binary desktop CRUX	name, version, release, deps, desc, alias
embedded binary CRUX	name, version, release, files
server CRUX	name, version, release, files, alias, md5sum

It's important to make sure users can't shoot themselves in the foot too easily, so configuring the required attributes at compile time might be a good approach.

Note that attributes are simply key-value pairs, and therefore pkgutils doesn't have to be changed to introduce new attributes (unless it internally depends on their values, of course).
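As a rough illustration of the idea, here is a minimal Python sketch that treats a package's attributes as a plain dictionary and checks them against the set required for a given use case. The profile names, the `REQUIRED_ATTRIBUTES` table layout, and the `missing_attributes` function are assumptions made for this example only; the attribute sets are taken from the use cases listed above, with "vers" and "rel" spelled out as "version" and "release".

```python
# Hypothetical profile table; keys are invented names for the use cases above.
REQUIRED_ATTRIBUTES = {
    "source": {"name", "version", "release", "files"},
    "binary-desktop": {"name", "version", "release", "deps", "desc", "alias"},
    "embedded": {"name", "version", "release", "files"},
    "server": {"name", "version", "release", "files", "alias", "md5sum"},
}


def missing_attributes(attributes, profile):
    """Return the sorted list of required attribute keys a package lacks."""
    return sorted(REQUIRED_ATTRIBUTES[profile] - set(attributes))


if __name__ == "__main__":
    # Attributes are just key-value pairs, so a dict is enough here.
    pkg = {"name": "curl", "version": "1.0", "release": "1", "files": "usr/bin/curl"}
    print(missing_attributes(pkg, "source"))
    print(missing_attributes(pkg, "server"))
```

Since new attributes are only new keys, nothing in such a check would need to change when an attribute is introduced, apart from the table itself.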

1.1.1. Attribute hooks

Random meta data could be added to packages with hooks, allowing users to build customized packages if needed. Examples for this are host, build user, build time, maintainer, packager, collection (ugh, where might this package come from) etc.

1.2. Meta data

Shipping the meta data with the package seems inevitable for many use cases; this would then also support *-install and *-remove scripts.

1.3. Implementation notes

We considered adding a magical /ATTRIBUTES directory to the packages, where attribute data is stored. In the worst case, extracting a package with plain tar -C / would just create this directory, and we would keep the advantage of a very simple package format (easy to construct by hand if needed).
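To make the /ATTRIBUTES idea a bit more concrete, the following Python sketch reads attributes out of a tar archive, assuming one file per attribute under `ATTRIBUTES/` with the file name as the key and the file content as the value. That layout, the `read_attributes` name, and the demo package contents are assumptions for illustration; the actual format was never specified on this page.

```python
import io
import tarfile

ATTR_DIR = "ATTRIBUTES/"  # assumed directory name, following the note above


def read_attributes(fileobj):
    """Collect {attribute name: value} from regular files under ATTRIBUTES/."""
    attributes = {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.startswith(ATTR_DIR):
                key = member.name[len(ATTR_DIR):]
                data = archive.extractfile(member).read()
                attributes[key] = data.decode("utf-8").strip()
    return attributes


def build_demo_package():
    """Build a tiny in-memory package (made-up contents) for demonstration."""
    entries = [
        ("ATTRIBUTES/name", b"curl\n"),
        ("ATTRIBUTES/version", b"1.0\n"),
        ("usr/bin/curl", b"not a real binary"),
    ]
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        for name, content in entries:
            info = tarfile.TarInfo(name)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(read_attributes(build_demo_package()))
```

A package built this way remains an ordinary tarball, which is the point of the proposal: it can still be created or inspected by hand with tar.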

2. Common functionality

2.1. From pkgutils

When pkgutils was originally written, only scripts would use the pkgdb. Nowadays, things are a bit different: tools like prt-get parse the pkgdb, and several tools call pkgutils as a process to obtain information. Exporting an API for things like pkgadd(path) or querying the pkgdb would help application developers.

2.2. From ports

ports and prt-get both support diff'ing and listing; however, ports sources all Pkgfiles (which is a potential security problem) and doesn't cope very well with duplicate ports. Mark Rosenstand has suggested merging ports into prt-get, which is probably a reasonable request for users who mostly use prt-get to maintain their system; it would allow things like caching on updates. Another scenario could be to keep ports for the networking part, but get rid of the listing and diff functionality. Finally, a sync command could be added to prt-get to call either ports or the drivers directly.

2.3. From prt-get

prt-get contains some code to handle dependencies, or parse Pkgfiles. It also has code for aliasing which is definitely in the wrong place there. The same holds true for locking, which currently only affects prt-get, but not pkgrm (i.e. when removing a package with pkgrm, it's not unlocked).

2.4. List of things a library could/should provide

library for
- Pkgfile parsing
- ports tree handling
- package db handling
- package installation (i.e. install_package() instead of pkgadd call)
- package db abstraction
- caching

Also, caching should be done right.

2.5. pkgdb abstraction layer

This is one of the features which is probably overdesign, but a common API for pkgdb access would make it possible to introduce a DB abstraction layer, allowing people to play with different backends.
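As a sketch of what such an abstraction layer might look like, the Python example below defines an abstract interface and one trivial in-memory backend. All names (`PackageDatabase`, `MemoryDatabase`, and the methods) are hypothetical and only loosely modelled on what pkgadd, pkgrm and pkginfo -o do today; nothing like this exists in pkgutils.

```python
from abc import ABC, abstractmethod


class PackageDatabase(ABC):
    """Hypothetical interface that every pkgdb backend would implement."""

    @abstractmethod
    def add(self, name, version, files):
        """Record a package and the files it owns."""

    @abstractmethod
    def remove(self, name):
        """Forget a package."""

    @abstractmethod
    def owner(self, path):
        """Return the name of the package owning path, or None."""


class MemoryDatabase(PackageDatabase):
    """Toy backend keeping everything in a dict; useful for experiments."""

    def __init__(self):
        self._packages = {}

    def add(self, name, version, files):
        self._packages[name] = (version, set(files))

    def remove(self, name):
        del self._packages[name]

    def owner(self, path):
        for name, (_version, files) in self._packages.items():
            if path in files:
                return name
        return None


if __name__ == "__main__":
    db = MemoryDatabase()
    db.add("curl", "1.0-1", ["usr/bin/curl"])
    print(db.owner("usr/bin/curl"))
```

A backend for the current flat-file database would implement the same three methods, so tools written against the interface would not need to care which one is in use.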

3. Common configuration

common prtdir config
- add nickname: prtdir contrib /usr/ports/contrib (possibly reuse name in /etc/ports to make configuration easier)
- "trust" for post-install per collection? Think

4. Security enhancements

Drop root
Sandbox

5. Implementation specific items

5.1. symlink handling

pkginfo -o and pkgadd -u are currently broken with respect to symlinks.

5.2. Pkgfile format

5.2.1. Dependencies

Mark Rosenstand mentions that packages could use "needs:" and "wants:" for required/optional dependencies, and that if (and only if) we want to separate build time deps from run time deps, it might be a good idea to prefix them, like:

# Needs: curl b:scons b:doxygen
# Wants: libgnomecanvas

My (Johannes') take: I don't think the separation is needed right now (and Mark agrees with that :-)), however the prefixing is pretty clean, and since adding meta data to packages would allow to design alternative flavours of CRUX which use binary packages, separation of the two would be a nice thing.
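To show how little parsing the prefix scheme would need, here is a small Python sketch that reads the "Needs:" and "Wants:" header comments and splits each into build time and run time dependencies. The function name, the returned structure, and the choice to treat unprefixed names as run time dependencies are assumptions for this example; the proposal above does not define them.

```python
import re

# Matches header comments such as "# Needs: curl b:scons"
HEADER = re.compile(r"^#\s*(Needs|Wants):\s*(.*)$")


def parse_dependencies(pkgfile_text):
    """Split Needs/Wants headers into build time and run time dependencies."""
    deps = {
        "needs": {"build": [], "run": []},
        "wants": {"build": [], "run": []},
    }
    for line in pkgfile_text.splitlines():
        match = HEADER.match(line)
        if not match:
            continue
        kind = match.group(1).lower()
        for token in match.group(2).split():
            if token.startswith("b:"):
                deps[kind]["build"].append(token[2:])  # strip the "b:" prefix
            else:
                deps[kind]["run"].append(token)  # assumption: no prefix = run time
    return deps


if __name__ == "__main__":
    example = "# Needs: curl b:scons b:doxygen\n# Wants: libgnomecanvas\n"
    print(parse_dependencies(example))
```

A tool that doesn't care about the separation could simply concatenate the two lists, which keeps the prefix optional in practice.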

6. INSTALL/REJECT feature for pkgadd.conf

Allow users to omit installation of certain files/directories based on patterns, like this:

INSTALL ^usr/info.*$ NO
INSTALL ^usr/share/bash-completion/.*$ YES

In combination with this, our package guidelines should require maintainers to install the files in question to make sure people actually get them. This concept is not meant to replace the no-junk rule, but to solve cases where there's no clearly superior solution (info pages or KDE doc for example).

It should be considered whether this needs to be on a per port basis (i.e. install info pages for selected ports only); one possibility would be to extend the above syntax to accept two regexes:

INSTALL .*     ^usr/info.*$   NO
INSTALL grub   ^usr/info.*$   YES

(the last matching rule wins)
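The two-regex variant could be evaluated roughly as in the Python sketch below. The function names, the default of installing a file when no rule matches, and the choice to match the port regex against the whole port name are assumptions for this example, not part of the proposal.

```python
import re


def parse_rules(config_text):
    """Parse 'INSTALL <port regex> <path regex> YES|NO' lines into rule tuples."""
    rules = []
    for line in config_text.splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[0] != "INSTALL":
            continue  # ignore anything that is not a two-regex INSTALL rule
        _, port_re, path_re, action = fields
        rules.append((re.compile(port_re), re.compile(path_re), action == "YES"))
    return rules


def should_install(rules, port, path, default=True):
    """Return the decision of the last rule matching both port and path."""
    decision = default
    for port_re, path_re, install in rules:
        if port_re.fullmatch(port) and path_re.search(path):
            decision = install  # later rules override earlier ones
    return decision


if __name__ == "__main__":
    config = (
        "INSTALL .*     ^usr/info.*$   NO\n"
        "INSTALL grub   ^usr/info.*$   YES\n"
    )
    rules = parse_rules(config)
    print(should_install(rules, "bash", "usr/info/bash.info"))
    print(should_install(rules, "grub", "usr/info/grub.info"))
```

With these two rules, info pages are skipped for every port except grub, because the grub rule comes later and overrides the general one.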

7. Post install hooks

Currently, pkgadd runs ldconfig after installing libraries, which is very useful. There are different cases which could benefit from such hooks, like:

Kernel module: run depmod
tex styles: texhash (IIRC)
GNOME post-install (?)
fc-cache, mkfontdir & friends

The question is whether the concept of running hooks from pkgadd should be extracted, such that you define the following in pkgadd.conf:

HOOK ^.*\.so                handle_shared_lib
HOOK ^lib/modules/.*\.ko    handle_depmod

The handle scripts would be in a predefined location, like /etc/pkgadd-hooks. This way, packages can provide such hooks, but we still let the user decide whether he/she wants to use them. Also, if you don't like this, you can always use INSTALL ^etc/pkgadd-hooks.* NO :-).
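The matching step of such a hook mechanism could look like the following Python sketch, which decides which handlers a package triggers based on its file list. The `hooks_for` function and the decision to run each handler at most once per package are assumptions for this example. The patterns are written the way they are presumably intended (`.*` rather than `*.`, and no leading slash, as in the INSTALL rules), which is also an assumption.

```python
import re

# (pattern, handler script name) pairs, as they might be read from pkgadd.conf.
HOOKS = [
    (re.compile(r"^.*\.so"), "handle_shared_lib"),
    (re.compile(r"^lib/modules/.*\.ko$"), "handle_depmod"),
]


def hooks_for(installed_files, hooks=HOOKS):
    """Return the handlers triggered by a file list, each once, in rule order."""
    triggered = []
    for pattern, handler in hooks:
        if handler in triggered:
            continue
        if any(pattern.search(path) for path in installed_files):
            triggered.append(handler)
    return triggered


if __name__ == "__main__":
    files = ["usr/lib/libfoo.so.1", "usr/bin/foo"]
    print(hooks_for(files))
```

pkgadd would then run the matching scripts from the hook directory after installation, much as it runs ldconfig today.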

8. Forks & rewrites

8.1. Han's fork

Allows continuing interrupted compilations
Enforces non-root building

8.2. Han's rewrite

uses a package dir with meta data per port
Stores rejected files per port, allowing a clean up on pkgrm

9. To rewrite or not

Per wanted to rewrite pkgadd in C, mainly to not link against libstdc++ anymore; porting to http://www.exactcode.de/embeddedSTL/ might be an alternative. Rewriting pkgmk was not planned as far as I (Johannes) know.

Han rewrote everything in sh.

10. Summary

This page lists lots of shiny new ideas, and the one thing we should always ask ourselves is "are we building the next emerge clone?". Every feature we add means added complexity and cost of maintenance. Let's not forget that while looking at the nice use cases, especially those which have no direct customer (for example the binary package CRUX flavours).

Brief listing of discussion points

introduction of meta data in binary packages
introduction of the "attribute" concept in pkgutils
non-root building in pkgmk
symlink handling in pkgadd
ports' & prt-get's redundant functionality
ports -u vs. prt-get sync (boils down to depending on prt-get)
rewriting pkgadd
introducing a libpkg to share common code
build time deps prefixing
new 'keywords' field
architecture "tagging"

11. Personal goals

I'm adding this section since several CRUX developers have different ideas about where this should go, and since I tried to keep the previous section free from personal feelings.

11.1. Johannes Winkelmann

I think pkgutils are not broken right now.

My goals are:

keep the original character
move functionality to the right layer (locking, aliases)
avoid duplicate code
get attributes since they're so powerful
- support binary package management better
ultimately consider rewriting parts of prt-get as a community effort

I believe that

pkgmk doesn't need to be rewritten at all
a rewrite of pkgadd might be the best approach, since we win attributes, libstdc++ independence and possibly libarchive support; however, the most urgent things (symlink support) can probably be implemented in current pkgutils too.
A common library would make many things easier for prt-get (or its competitors maybe), but is not strictly required at this point in time

11.1.1. Notes

Maybe also a good moment to consider categories/keywords; see http://jw.tks6.net/files/crux/keywords
Same for an arch component, maybe generated by pkgmk, and included in the package name
An open question is how to cope with renamed packages; I'd suggest keeping a list on crux.nu, and providing a tool to check for "old" ports on one's system
2006-04-25 (DanielMueller): I don't like the idea to rewrite pkgutils in C. Isn't C++ the way to go?

11.1.2. Further documents

PkgutilsAttributes
