Introduction

FUSE (Filesystem in Userspace) is a framework for developing filesystems in user space, which allows secure, non-privileged mounts. Filesystems built on this framework are often custom-made for a few particular scenarios, which gives applications flexibility in how they store and retrieve data on a filesystem. The framework contains both user-space and kernel components. This article sheds some light on the FUSE kernel module and the libfuse layer: it looks at the important data structures and how they are used in the IO path.

Overview of FUSE architecture

Below is a simplistic view of the FUSE architecture:

FUSE Architecture

User applications access the FUSE filesystem through glibc (system calls) in the same way as any other filesystem. The Virtual File System (VFS: https://docs.kernel.org/filesystems/vfs.html) layer routes these requests to the FUSE module in the usual way. The similarities with traditional filesystems end here. The FUSE kernel module puts each request in a queue and wakes up the FUSE user-level daemon thread(s) sleeping on a wait queue (waitq). The user-level daemon uses the libfuse framework to fetch the request, executes it, and sends the response back to the FUSE kernel module through the libfuse and glibc layers. From the FUSE module, the response returns to the user application via VFS, again through the same mechanisms that traditional filesystems use.

The FUSE user-level daemon acts as a pseudo backing device for the FUSE filesystem and provides the filesystem's services to user applications. On its front end it interacts with libfuse to receive FUSE requests; on its back end it interacts with the actual backing store, which could itself be a server daemon on the same node or on a remote node. For example, in sshfs (https://github.com/libfuse/sshfs) the user-level daemon connects over ssh to a remote node so that a configured directory on that node can be accessed via FUSE. Another example is the Gluster (https://www.gluster.org/) native client, which uses FUSE to mount a Gluster volume. Each FUSE-based filesystem therefore uses its own methods in the user-level daemon to interact with the actual backing store. The details of the user-level daemon are implementation specific, differ between filesystems, and are out of scope for this document.

Libfuse

libfuse is a library that provides a reference implementation for communicating with the FUSE kernel module. Using libfuse, developers can quickly build a user-space background daemon: the filesystem only has to implement a set of callback functions in struct fuse_operations, and the library handles everything else. These callbacks are where requests are routed either to a remote server, as in the case of sshfs, or to a local filesystem, depending on the design of the user-space filesystem. They form the crux of a FUSE filesystem. libfuse provides two sets of APIs: the high-level API, which is synchronous, and the low-level API, which is asynchronous. libfuse also provides functions to mount and unmount the filesystem; these eventually spawn the fusermount binary that ships with the fuse package (the fuse rpm on RPM-based distributions).

With libfuse, the user-space FUSE filesystem implements functions in the fuse_operations structure and passes an instance of this structure to fuse_main(). The members of fuse_operations are listed below to give an idea of what developers need to provide for a FUSE filesystem. The structure is akin to struct file_operations in regular filesystems. It contains pointers to the following functions. The comments in fuse.h note that all of them are optional, although some (such as getattr()) are essential for a useful filesystem.

getattr()
readlink()
mknod()
mkdir()  
unlink()   
rmdir()  
symlink() 
rename()
link()  
chmod()  
chown()
truncate()  
open()
read()
write()  
statfs()
flush()  
release()  
fsync()  
setxattr()  
getxattr()  
listxattr()  
removexattr()  
opendir()
readdir()  
releasedir()  
fsyncdir()  
init()
destroy()   
access()  
create()
lock()
utimens()  
bmap()  
ioctl()  
poll()  
write_buf()
read_buf()
flock()
fallocate()  

A more detailed explanation of the above functions is provided in the source code itself, as part of the structure definition in https://github.com/libfuse/libfuse/blob/master/include/fuse.h.
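
The callback-table idea can be illustrated without libfuse. The following is a minimal sketch and not libfuse code: struct toy_operations, toy_getattr(), dispatch_getattr() and dispatch_read() are hypothetical names invented for this illustration. It shows a library calling filesystem-supplied function pointers and, as an assumption of the sketch, reporting -ENOSYS when a callback has been left unset.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct fuse_operations. */
struct toy_operations {
    int (*getattr)(const char *path, long *size_out);
    int (*read)(const char *path, char *buf, size_t size, long offset);
};

/* Filesystem-supplied callback: only "/hello" exists and is 5 bytes long. */
static int toy_getattr(const char *path, long *size_out)
{
    if (strcmp(path, "/hello") != 0)
        return -ENOENT;
    *size_out = 5;
    return 0;
}

/* The filesystem fills in only the callbacks it implements. */
const struct toy_operations toy_oper = {
    .getattr = toy_getattr,
    /* .read is deliberately left unset (NULL) */
};

/* Library side: call the callback if it is set, else "not implemented". */
int dispatch_getattr(const struct toy_operations *ops,
                     const char *path, long *size_out)
{
    assert(ops != NULL);
    if (ops->getattr == NULL)
        return -ENOSYS;
    return ops->getattr(path, size_out);
}

int dispatch_read(const struct toy_operations *ops, const char *path,
                  char *buf, size_t size, long offset)
{
    assert(ops != NULL);
    if (ops->read == NULL)
        return -ENOSYS;
    return ops->read(path, buf, size, offset);
}
```

With this table, dispatch_getattr(&toy_oper, "/hello", &size) reaches toy_getattr(), while dispatch_read() returns -ENOSYS because no read callback was registered.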

The libfuse source includes an example, example/hello.c (https://github.com/libfuse/libfuse/blob/master/example/hello.c), that shows how to use this structure to develop a custom FUSE-based filesystem. It takes the name of a single file and the contents of that file on the command line and stores them in the structure below.

/*
 * Command line options
 *
 * We can't set default values for the char* fields here because
 * fuse_opt_parse would attempt to free() them when the user specifies
 * different values on the command line.
 */
static struct options {
    const char *filename;
    const char *contents;
    int show_help;
} options;

In main(), it fills in default values for the file name and its contents, parses the options, and finally calls fuse_main().

fuse_main(args.argc, args.argv, &hello_oper, NULL);

hello.c implements the following filesystem operations:

static const struct fuse_operations hello_oper = {
    .init     = hello_init,
    .getattr    = hello_getattr,
    .readdir    = hello_readdir,
    .open         = hello_open,
    .read         = hello_read,
};
  • init(): Initialize the features of the filesystem.
  • getattr(): Return hard-coded attributes for the root directory and for the file.
  • readdir(): List the contents of the root (/) of the filesystem, which holds only the given or default file.
  • open(): Open the file.
  • read(): Return the contents of the file as initialized in main().
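
To make the getattr() and read() behavior concrete, here is a minimal sketch modeled on the logic described above. It is not the code from hello.c: the names sketch_getattr() and sketch_read() are hypothetical, the libfuse-specific struct fuse_file_info argument is omitted so that the block compiles without libfuse headers, and the file name "hello" and contents "Hello World!\n" are assumed defaults hard-coded for the illustration.

```c
#define _XOPEN_SOURCE 700   /* exposes S_IFDIR / S_IFREG in strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Assumed defaults; hello.c takes these from the command line. */
static const char *const file_name = "hello";
static const char *const file_contents = "Hello World!\n";

/* Report hard-coded attributes for "/" and "/hello", -ENOENT otherwise. */
int sketch_getattr(const char *path, struct stat *stbuf)
{
    memset(stbuf, 0, sizeof(*stbuf));
    if (strcmp(path, "/") == 0) {
        stbuf->st_mode = S_IFDIR | 0755;   /* the root is a directory */
        stbuf->st_nlink = 2;
        return 0;
    }
    if (path[0] == '/' && strcmp(path + 1, file_name) == 0) {
        stbuf->st_mode = S_IFREG | 0444;   /* read-only regular file */
        stbuf->st_nlink = 1;
        stbuf->st_size = (off_t)strlen(file_contents);
        return 0;
    }
    return -ENOENT;
}

/* Copy up to size bytes of the contents, starting at offset.
 * Returns the number of bytes copied, or -ENOENT for an unknown path. */
int sketch_read(const char *path, char *buf, size_t size, off_t offset)
{
    assert(offset >= 0);
    if (path[0] != '/' || strcmp(path + 1, file_name) != 0)
        return -ENOENT;

    size_t len = strlen(file_contents);
    if ((size_t)offset >= len)
        return 0;                          /* reading at or past end of file */
    if (size > len - (size_t)offset)
        size = len - (size_t)offset;       /* clamp to the remaining bytes */
    memcpy(buf, file_contents + offset, size);
    return (int)size;
}
```

The clamping in sketch_read() is what lets a caller ask for more bytes than the file holds and still receive a short, valid read.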

The implementation in hello.c is deliberately rudimentary; it is meant to give the reader a feel for what it takes to get a filesystem up and running.

The high-level and low-level APIs are both implemented inside the libfuse layer. The fuse_lowlevel_ops structure is defined in fuse_lowlevel.h and is akin to struct inode_operations in regular filesystems. Each function is documented in detail as part of the definition itself. libfuse uses this structure itself: the file fuse_lowlevel.c contains the implementation of the low-level API, and fuse.c implements the high-level API on top of it.

FUSE Kernel Module

Three filesystem types in the kernel are related to FUSE. All of them are registered by the fuse kernel module.

  1. fuse – This is the general-purpose filesystem type that provides most of the functionality for user-space filesystems such as sshfs, MooseFS and s3fs-fuse (Amazon S3 over FUSE).

  2. fuseblk – This type is for filesystems backed by a block device; it is currently used to mount NTFS partitions as user-space filesystems on Linux.

  3. fusectl – This is a control filesystem, mounted at /sys/fs/fuse/connections, that can be used to inspect and configure a few FUSE parameters. Every mounted FUSE filesystem gets a directory entry under /sys/fs/fuse/connections. The entry looks like:

    [root@gmananth-20230327-1240 connections]# pwd
    /sys/fs/fuse/connections
    
    [root@gmananth-20230327-1240 connections]# ls
    50
    
    [root@gmananth-20230327-1240 connections]# ls -l 50
    total 0
    --w------- 1 root root 0 Aug 29 11:17 abort
    -rw------- 1 root root 0 Aug 29 11:17 congestion_threshold
    -rw------- 1 root root 0 Aug 29 11:17 max_background
    -r-------- 1 root root 0 Aug 29 11:17 waiting
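
As a rough illustration of how a monitoring tool might consume these files, the sketch below reads a single decimal counter such as waiting or max_background. The helper names parse_counter() and read_counter_file() are hypothetical, and the sketch assumes each file holds one unsigned decimal number on its first line.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse one unsigned decimal value; returns 0 on success, -EINVAL otherwise. */
int parse_counter(const char *text, unsigned long *value)
{
    assert(text != NULL && value != NULL);
    char *end = NULL;
    errno = 0;
    unsigned long v = strtoul(text, &end, 10);
    if (errno != 0 || end == text)
        return -EINVAL;      /* out of range, or no digits were found */
    *value = v;
    return 0;
}

/* Read the first line of a file such as
 * /sys/fs/fuse/connections/<id>/waiting and parse it as a counter.
 * Returns 0 on success or a negative errno value on failure. */
int read_counter_file(const char *path, unsigned long *value)
{
    char line[32];
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -errno;
    char *got = fgets(line, sizeof line, fp);
    fclose(fp);
    return got != NULL ? parse_counter(line, value) : -EIO;
}
```

For the connection shown in the listing, a call would look like read_counter_file("/sys/fs/fuse/connections/50/waiting", &n). The files are owned by root with owner-only permissions, so the caller needs matching privileges.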

Some important data structures of the FUSE kernel layer are discussed below.

FUSE Mount

struct fuse_mount is the topmost data structure and is defined in the header file https://github.com/torvalds/linux/blob/master/fs/fuse/fuse_i.h.

/*
 * Represents a mounted filesystem, potentially a submount.
 *
 * This object allows sharing a fuse_conn between separate mounts to
 * allow submounts with dedicated superblocks and thus separate device
 * IDs.
*/
struct fuse_mount {
    /* Underlying (potentially shared) connection to the FUSE server */
    struct fuse_conn *fc;

    /*
     * Super block for this connection (fc->killsb must be held when
     * accessing this).
    */
    struct super_block *sb;

    /* Entry on fc->mounts */
    struct list_head fc_entry;
    struct rcu_head rcu;
};

This structure is created during a mount operation in fuse_get_tree(), which is called from do_new_mount()->vfs_get_tree() or vfs_kern_mount()->fc_mount()->vfs_get_tree().

This structure can be considered the entry point to the FUSE layer in the kernel: a pointer to it is stored in the VFS superblock at sb->s_fs_info.
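
The pointer chain from superblock to mount to connection can be pictured with a small user-space model. This is only a sketch: the toy_ types and functions are hypothetical stand-ins for the kernel structures, reduced to the fields discussed here. It also illustrates the point made in the structure's comment, that several mounts with separate superblocks can share one connection.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct toy_conn { unsigned max_background; };    /* struct fuse_conn */
struct toy_super_block { void *s_fs_info; };     /* struct super_block */
struct toy_mount {                               /* struct fuse_mount */
    struct toy_conn *fc;
    struct toy_super_block *sb;
};

/* Link a mount to its superblock and to a (possibly shared) connection. */
void toy_attach(struct toy_mount *fm, struct toy_super_block *sb,
                struct toy_conn *fc)
{
    assert(fm != NULL && sb != NULL && fc != NULL);
    fm->fc = fc;
    fm->sb = sb;
    sb->s_fs_info = fm;      /* the superblock points back at the mount */
}

/* Follow superblock -> mount -> connection. */
struct toy_conn *toy_conn_from_super(const struct toy_super_block *sb)
{
    const struct toy_mount *fm = sb->s_fs_info;
    return fm->fc;
}
```

In this model a root mount and a submount each have their own toy_mount and toy_super_block, yet toy_conn_from_super() returns the same toy_conn for both.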

FUSE Connection

struct fuse_conn is the most important structure in the FUSE layer at the kernel level. It sits in both the control path and the IO path. A fuse_conn represents the communication channel between the kernel and the user-level daemon, which in turn talks to the actual backend device or backend server (as the case may be). struct fuse_conn tracks activity on this channel from the kernel's point of view. It is a huge structure containing a lot of information required for the control and IO paths. Some important members are described here. The structure is defined in the header file https://github.com/torvalds/linux/blob/master/fs/fuse/fuse_i.h.

  • initialized: Whether the connection is initialized.

  • blocked: Whether the connection is blocked.

  • connected: Whether the connection has been established with the background FUSE server.

  • aborted: Whether the connection is aborted.

  • max_background: The maximum number of outstanding background FUSE requests. Requests beyond this limit wait in the background queue until they can be submitted.

  • congestion_threshold: The number of background requests at which congestion starts. This number is less than max_background.

  • num_background: The current number of outstanding background FUSE requests, whether they are still in the background queue or have already been submitted.

  • active_background: The number of background FUSE requests that have been submitted but have not completed. Basically this is the number of outstanding background IOs with respect to the FUSE layer.

    With respect to the user, however, the total number of outstanding background IOs is num_background, which already includes active_background.

  • bg_queue: This is the queue for background FUSE requests. The function fuse_request_queue_background() adds FUSE requests to this queue.

  • iq: This is the input queue for active FUSE requests.
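
The interplay of these counters can be sketched as a small user-space model. The toy_ names are hypothetical and locking is left out. The sketch assumes that num_background counts every background request that has not yet completed (including those already counted in active_background), and that requests move from bg_queue to the input queue only while active_background is below max_background.

```c
#include <assert.h>

/* Hypothetical stand-in for the background counters in struct fuse_conn. */
struct toy_conn {
    unsigned max_background;    /* limit on requests submitted to the daemon */
    unsigned num_background;    /* background requests not yet completed */
    unsigned active_background; /* of those, the ones already moved to iq */
    unsigned bg_queued;         /* length of the modeled bg_queue */
};

/* Move requests from bg_queue to the input queue while under the limit. */
void toy_flush_bg_queue(struct toy_conn *fc)
{
    while (fc->active_background < fc->max_background && fc->bg_queued > 0) {
        fc->bg_queued--;
        fc->active_background++;
    }
}

/* Queue one background request, then try to submit it right away. */
void toy_queue_background(struct toy_conn *fc)
{
    fc->num_background++;
    fc->bg_queued++;
    toy_flush_bg_queue(fc);
}

/* One submitted request completed: drop both counters, refill from bg_queue. */
void toy_request_end(struct toy_conn *fc)
{
    assert(fc->active_background > 0 && fc->num_background > 0);
    fc->num_background--;
    fc->active_background--;
    toy_flush_bg_queue(fc);
}
```

With max_background set to 3, queuing five requests leaves three active and two waiting in the modeled bg_queue. Each completion then lets one waiting request through.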

FUSE Input Queue

This queue is the member called iq in struct fuse_conn. Its type, struct fuse_iqueue, is defined in the header file https://github.com/torvalds/linux/blob/master/fs/fuse/fuse_i.h.

struct fuse_iqueue {
    /** Connection established */
    unsigned connected;

    /** Lock protecting accesses to members of this structure */
    spinlock_t lock;

    /** Readers of the connection are waiting on this */
    wait_queue_head_t waitq;

    /** The next unique request id */
    u64 reqctr;

    /** The list of pending requests */
    struct list_head pending;

    /** Pending interrupts */
    struct list_head interrupts;

    /** Queue of pending forgets */
    struct fuse_forget_link forget_list_head;
    struct fuse_forget_link *forget_list_tail;

    /** Batching of FORGET requests (positive indicates FORGET batch) */
    int forget_batch;

    /** O_ASYNC requests */
    struct fasync_struct *fasync;

    /** Device-specific callbacks */
    const struct fuse_iqueue_ops *ops;

    /** Device-specific state */
    void *priv;
};
  • waitq: This is the wait queue on which the userspace daemon that serves the FUSE requests sleeps.
  • pending: This is the actual queue of FUSE requests submitted to the background daemon. Background requests on this list are counted in active_background. FUSE requests are taken from the bg_queue list in fuse_conn and put on this list, and the processes on waitq are woken up to read them. This is done in the function queue_request_and_unlock(), which is eventually reached from the call stacks of the read_iter(), splice_read(), write_iter() and splice_write() interfaces.
  • fasync: This is a list to which the fasync() interface in struct file_operations adds elements.
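
The pending list behaves like a FIFO that is filled on one side and drained by the daemon on the other. The single-threaded sketch below models only that ordering. The toy_ names are hypothetical, the wake-up is reduced to a counter, and the request id simply increases by one per request, which is a simplification rather than the kernel's actual id scheme.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_req {
    uint64_t unique;          /* request id taken from reqctr */
    struct toy_req *next;     /* next request on the pending list */
};

/* Hypothetical stand-in for struct fuse_iqueue. */
struct toy_iqueue {
    uint64_t reqctr;          /* last request id handed out */
    struct toy_req *head;     /* oldest pending request */
    struct toy_req *tail;     /* newest pending request */
    unsigned wakeups;         /* stand-in for waking readers on waitq */
};

/* Kernel side: give the request an id, append it, wake the reader. */
void toy_queue_request(struct toy_iqueue *fiq, struct toy_req *req)
{
    assert(fiq != NULL && req != NULL);
    req->unique = ++fiq->reqctr;
    req->next = NULL;
    if (fiq->tail != NULL)
        fiq->tail->next = req;
    else
        fiq->head = req;
    fiq->tail = req;
    fiq->wakeups++;
}

/* Daemon side: take the oldest pending request, or NULL if none.
 * A real reader would sleep on waitq instead of returning NULL. */
struct toy_req *toy_dev_read(struct toy_iqueue *fiq)
{
    struct toy_req *req = fiq->head;
    if (req == NULL)
        return NULL;
    fiq->head = req->next;
    if (fiq->head == NULL)
        fiq->tail = NULL;
    req->next = NULL;
    return req;
}
```

Requests come out of toy_dev_read() in the order in which toy_queue_request() added them, each carrying a distinct id.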

FUSE Request

A FUSE request is represented by struct fuse_req, which is defined in the header file https://github.com/torvalds/linux/blob/master/fs/fuse/fuse_i.h.

struct fuse_req {
    /** This can be on either pending processing or io lists in
    fuse_conn */
    struct list_head list;

    /** Entry on the interrupts list */
    struct list_head intr_entry;

    /* Input/output arguments */
    struct fuse_args *args;

    /** refcount */
    refcount_t count;

    /* Request flags, updated with test/set/clear_bit() */
    unsigned long flags;

    /* The request input header */
    struct {
        struct fuse_in_header h;
    } in;

    /* The request output header */
    struct {
        struct fuse_out_header h;
    } out;

    /** Used to wake up the task waiting for completion of request*/
    wait_queue_head_t waitq;

    #if IS_ENABLED(CONFIG_VIRTIO_FS)
        /** virtio-fs's physically contiguous buffer for in and out args */
        void *argbuf;
    #endif

    /** fuse_mount this request belongs to */
    struct fuse_mount *fm;
};

A fuse_req is allocated in the functions fuse_simple_request() and fuse_simple_background(). The former eventually enqueues the fuse_req on fuse_iqueue.pending, the submitted/active request queue. The latter puts the request on bg_queue, the background request queue, from where it is later moved to fuse_iqueue.pending. Requests that have been read by the userspace background daemon are tracked in a fuse_dev structure, indicating that they have been submitted to the backend device.

    /**
     * Fuse device instance
     */
    struct fuse_dev {
        /** Fuse connection for this device */
        struct fuse_conn *fc;

        /** Processing queue */
        struct fuse_pqueue pq;

        /** list entry on fc->devices */
        struct list_head entry;
    };

The struct fuse_pqueue is the structure that holds the requests under execution.

    struct fuse_pqueue {
        /** Connection established */
        unsigned connected;

        /** Lock protecting accesses to members of this structure */
        spinlock_t lock;

        /** Hash table of requests being processed */
        struct list_head *processing;

        /** The list of requests under I/O */
        struct list_head io;
    };

The applications that issued the IO (and thereby created the fuse_req) wait on fuse_req.waitq and are woken by fuse_request_end(), which signals that execution of the request has completed.
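
The role of the processing hash table can be sketched as follows: once the daemon has read a request, it is parked in a bucket chosen by its id, and the reply is matched back to it by that id. This is a user-space model with hypothetical names (inflight_req, toy_pqueue and the toy_ functions). The bucket count, the modulo hash and the done flag standing in for the wake-up are all assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PQ_HASH_SIZE 8    /* assumed bucket count for this sketch */

struct inflight_req {
    uint64_t unique;              /* request id */
    int done;                     /* set once the reply has arrived */
    struct inflight_req *next;    /* chain within one hash bucket */
};

/* Hypothetical stand-in for struct fuse_pqueue. */
struct toy_pqueue {
    struct inflight_req *processing[TOY_PQ_HASH_SIZE];
};

static size_t toy_hash(uint64_t unique)
{
    return (size_t)(unique % TOY_PQ_HASH_SIZE);
}

/* The daemon has read the request: park it until its reply comes back. */
void toy_start_processing(struct toy_pqueue *fpq, struct inflight_req *req)
{
    assert(fpq != NULL && req != NULL);
    size_t h = toy_hash(req->unique);
    req->done = 0;
    req->next = fpq->processing[h];
    fpq->processing[h] = req;
}

/* A reply for `unique` arrived: unlink the request and mark it complete.
 * Returns the request, or NULL if no such request is in flight. */
struct inflight_req *toy_end_request(struct toy_pqueue *fpq, uint64_t unique)
{
    struct inflight_req **link = &fpq->processing[toy_hash(unique)];
    for (; *link != NULL; link = &(*link)->next) {
        if ((*link)->unique == unique) {
            struct inflight_req *req = *link;
            *link = req->next;
            req->next = NULL;
            req->done = 1;    /* the kernel would wake the waiter here */
            return req;
        }
    }
    return NULL;
}
```

Hashing by request id keeps the lookup short even with many requests in flight, since a reply only has to search the one bucket its id maps to.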

Conclusion:

The FUSE framework thus consists of a kernel component and the libfuse library. Together they enable the quick development of a userspace filesystem that is custom-made for a specific application.
