filesystem-based repository access

With the putback of changeset 1895 to the pkg(5) gate, pkg(1) now supports filesystem-based repository access.  Please note that this functionality is not expected to be available in an OS build until after b139.

What changed?

There are no user-visible changes to command syntax or client operation.  The only noticeable change is that any command that accepted a repository URI such as 'http://localhost:10000' now also accepts URIs for filesystem locations such as 'file:/path/to/my/repository'.

For example, if you were creating a new image and had a repository located at '/export/myrepo', you could do this:

  pkg image-create -p file:/export/myrepo /path/to/image

See the pkg(1) man page for more examples.

What does this mean?

This change means that a pkg.depotd server is no longer required for serving package data if a client can access the package repository via the filesystem (for example, over an NFS mount).

What are the related bug entries?

The following RFEs/Bugs were resolved by changeset 1895:

15762 client support for filesystem-based repository access
10244 caching dictionaries as a class variable prevents multi-image and repo search
14802 ability to have separate read / write download caches
15763 test suite should not start depot server unless necessary
15764 file_manager insert needs to stat() less and return more

What are the potential issues?

Currently, the client does not store data from a repository accessed via the filesystem in the image's download cache (e.g. /var/pkg/download).  The assumption is that filesystem access to the repository is reliable and reasonably fast.

In some scenarios (such as NFS or SMB mounts), filesystem-based access may be slow, unreliable, or otherwise unsuitable for client transport.  In those cases, use http-based access to the repository via pkg.depotd(1M) instead.  It is hoped that a future client can automatically account for the performance of available filesystem-based resources when retrieving package content.
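Until the client can make that choice itself, a small wrapper can at least fall back to an http origin when the repository path is not reachable.  The following is a minimal sketch, not part of pkg(1): the function name choose_origin is hypothetical, and it only checks that the directory exists and is readable, not whether access is fast or reliable.

```shell
#!/bin/sh
# choose_origin REPO_DIR FALLBACK_URI   (hypothetical helper, not part of pkg)
# Prints a file: URI when REPO_DIR is a readable directory;
# otherwise prints FALLBACK_URI unchanged.
choose_origin() {
    repo_dir=$1
    fallback_uri=$2
    if [ -d "$repo_dir" ] && [ -r "$repo_dir" ]; then
        printf 'file:%s\n' "$repo_dir"
    else
        printf '%s\n' "$fallback_uri"
    fi
}
```

Assuming the repository path and depot URL used elsewhere in this entry, it could be used like this:

  pkg image-create -p "$(choose_origin /export/myrepo http://localhost:10000)" /path/to/image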

Search operations performed against a file-based repository can use substantially more memory when the repository contains a large number of packages (such as the /dev repository).  This is because the client is acting as both the search server and the search client.

To avoid this overhead, the related publisher can be configured with both an http(s) origin and the file-based origin.  The client will then try the http(s)-based origin(s) first for search operations, but the file-based origin first for all package file retrieval operations.  For example:

  pkg set-publisher -g file:/export/myrepo \
    -g http://localhost:10000 example.com

Thanks

Many thanks to the individuals who collaborated on, reviewed, or contributed to this project.
Comments:

Ok, this is great news.

However, how does one build a repository in the first place?

Can it pull pkgs from /var/pkg/download?

If I'm behind a firewall can I point clients at an internal machine and have them pull pkgs from that machine?

thanks!,
alan

Posted by Alan Pae on May 12, 2010 at 02:15 PM CDT #

Currently, you create a repository and publish packages using the pkgsend command. See pkgsend(1) for more information.

As for pulling pkgs from /var/pkg/download, that directory only contains package files, not package manifests or repository catalogs. There is a webrev out for review right now that would allow clients to share data in /var/pkg/download as a mirror, but what you're talking about is not yet implemented.

Posted by Shawn Walker on May 12, 2010 at 03:54 PM CDT #
