If you're running ODEE in a Linux environment, it's a good idea to have a seasoned Linux sysadmin on your team to help manage the operating system. Linux is a great tool: it's generally lean, fast, and highly configurable. Being highly configurable also means some work has to be put in to tune your system for optimal performance. In this post, I'd like to discuss open file descriptors, a topic that is rarely discussed during a Documaker implementation (and almost never for implementations on Windows).

First, let me tell you why I'm even talking about this. A long-time Documaker customer who is currently converting from Standard Edition to Enterprise Edition (while simultaneously moving all their systems to the cloud) contacted me about an error that was occasionally popping up in their system. They were able to work around it by restarting the Doc Factory, but were curious about the nature of the error. The error, in all its glory, manifested as:
Unexpected exception: java.io.IOException: Cannot run program "/var/opt/oracle/u01/data/domains/odee_1/documaker/bin/docfactory_assembler" (in directory "/var/opt/oracle/u01/data/domains/odee_1/documaker/mstrres/dmres"): error=24, Too many open files
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
    at oracle.documaker.processmonitor.process.instance.Instance.reset(Instance.java:1318)
    at oracle.documaker.processmonitor.process.Process.startInstance(Process.java:195)
    at oracle.documaker.processmonitor.process.Process.restartInstance(Process.java:311)
    at oracle.documaker.processmonitor.process.monitors.InstanceMonitor.restart(InstanceMonitor.java:673)
    at oracle.documaker.processmonitor.process.monitors.InstanceMonitor.run(InstanceMonitor.java:206)
Caused by: java.io.IOException: error=24, Too many open files
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
    at java.lang.ProcessImpl.start(ProcessImpl.java:134)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
    ... 5 more
The pertinent part is at the end of the first line: too many open files. What does that even mean? Basically, in Linux (and other related operating systems) a file descriptor is an abstract indicator (also known as a "handle") used to access an I/O resource, such as a file, pipe, or network socket. You can read more about file descriptors at Wikipedia. Every process that accesses an I/O resource needs a handle to that resource. A software system that accesses many resources such as databases, network connections, and files, or a web application with multiple users connected via browser, will generate many such handles. That's about all you really need to know about file descriptors themselves.

So how does this correlate to the error above? The Linux operating system imposes a hard limit on the number of file descriptors available to a given process. It also imposes a soft limit, which can be managed by the user and raised up to the hard limit. To check your particular system, log in via a shell and issue the following command:
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 96202
max locked memory (kbytes, -l) 134217728
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 16384
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
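Since the open files limit is the one we care about for this problem, here's a quick sketch (using values from my small VM; yours will differ) of checking that limit directly and raising the soft limit in place. Be aware that a change made this way applies only to the current shell session and does not persist, which is why we'll set it in the Doc Factory startup script shortly; 8,192 is simply the starting point I suggest below:

# show just the open files (soft) limit
$ ulimit -n
1024
# raise the soft limit for this shell session only
$ ulimit -S -n 8192
$ ulimit -n
8192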
The -a parameter displays all limits. As shown above, you can use other parameters to show just the limit you want to see, such as -n, the open files limit (and the limit we're interested in for solving this problem). As you can see, my open files limit is 1,024, which is actually quite small. This means that any process started by my user on this system can have a maximum of 1,024 file handles. By comparison, the recommended file descriptor limit for a user running Oracle DB is 8,192! This particular system runs in a small virtual machine, so I have tuned it accordingly.

Before we go about changing the limit, we should have some idea of what the limit should be, and the answer is, as always: it depends. A good starting point is 8,192. To properly tune the system, you'll want to run some performance tests, monitor the number of open file handles consumed by Documaker processes, and adjust accordingly. Another option is simply to make the soft limit unlimited, assuming Documaker is the only process of any consequence running on the machine. You should work with your sysadmin to determine the appropriate value and approach.

When you're ready to change the soft limit, the recommended approach is to set it at the user level when you execute the Doc Factory startup script. In fact, there are probably already commented-out lines in that script that you can use as a model. To set the limits for Doc Factory, edit the docfactory.sh script in [ODEE_HOME]/documaker/docfactory/bin and locate the section that looks like this:
# additional necessary environment settings go here
# such as DB2INSTANCE or loading of . .db2profile
# or ORACLE_SID, MQSERVER, JAVAPATH, CLASSPATH, ...
#--------------------------------------------------------
...
#ulimit -c unlimited
#ulimit -s unlimited
#ulimit -m unlimited
#ulimit -d unlimited
ulimit -S -n unlimited
The final line above, ulimit -S -n unlimited, is the one you add; it sets the soft limit to unlimited. This script also prints the limits during startup, so you should be able to see the effect of your change when starting Doc Factory. Note that setting the soft limit to unlimited doesn't really make it unlimited; it sets it equal to the hard limit. As I stated previously, if Documaker is the only process of consequence running on this machine, then setting it to unlimited is probably okay; but if you have other processes running under the same user account, you will want to perform some testing to determine the appropriate value and set the limit accordingly. In that case, you'll want to know the hard limit, which you can discover with the following command:
$ ulimit -H -n
65536
Again, this is my tiny VM system; you can see that the hard limit is 65,536, so when I start up Doc Factory with ulimit -S -n unlimited in place, it should report the open files limit as 65,536. You can leave it set like this and probably never have another issue, but if you must set a specific limit lower than the hard limit, you will have to do some performance testing and take measurements while the system is processing. There are a number of ways to accomplish this using the lsof command, which reports the list of open files. One version of this command reports the open file handles for a given user, which you can pipe through wc to count them. On my tiny VM system, the oracle user runs all of my Oracle DB, WebLogic, Doc Factory, and Docupresentment processes:
$ lsof -u oracle | wc -l
66123
You might be looking at that number and wondering why it is higher than the hard limit. Recall that the limit is per process, so the total counted by lsof across all of a user's processes can certainly exceed it. If you want to narrow down the lsof output to a specific process, you'll need to know the process ID (PID) you want to examine. The following command shows the PIDs for the Doc Factory, WebLogic, and Docupresentment processes, along with some extra information. The first column is the PID; the second and third columns aren't strictly needed, but help verify we're looking at the right processes: the second column is the process name and the third is part of the command line used to start it. In the output below, the java processes with -Dweblogic.Name arguments are WebLogic, the ids* processes are Docupresentment (IDS), and the docfactory_* processes are Doc Factory.
$ ps -ef | grep -e docfactory -e ids | grep -v grep | awk '{print $2 " " $8 " " $12}'
2996 /usr/java/jdk1.8.0_60/bin/java -Dweblogic.Name=AdminServer
4069 /usr/java/jdk1.8.0_60/bin/java -Dweblogic.Name=jms_server1
7823 /usr/java/jdk1.8.0_60/bin/java -Dweblogic.Name=idm_server1
24950 /usr/java/jdk1.8.0_60/bin/java -Dweblogic.Name=dmkr_server1
13563 idsrouter.exe -cp
14482 idsinstance.exe -Dconfig.jndi.name=DMKRConfig
14744 idsinstance.exe -Dconfig.jndi.name=DMKRConfig
21152 idsinstance.exe -Dconfig.jndi.name=DMKRConfig
22731 idsinstance.exe -Dconfig.jndi.name=DMKRConfig
27529 idswatchdog.exe lib/DocucorpStartup.jar
22802 /oracle/odee/documaker/jre/bin/docfactory_supervisor -Djava.endorsed.dirs=/oracle/odee/documaker/docfactory/lib/endorsed
22945 docfactory_batcher -Djava.library.path=/oracle/odee/documaker/bin
22946 /oracle/odee/documaker/bin/docfactory_assembler -nu
22968 /oracle/odee/documaker/bin/docfactory_presenter -nu
22974 docfactory_pubnotifier -Dodee.home=/oracle/odee/documaker/
22984 docfactory_scheduler -Djava.library.path=/oracle/odee/documaker/bin
22985 docfactory_archiver -Dodee.home=/oracle/odee/documaker/
22987 /oracle/odee/documaker/bin/docfactory_distributor -nu
22988 docfactory_receiver -Djava.library.path=/oracle/odee/documaker/bin
22989 docfactory_historian -Djava.library.path=/oracle/odee/documaker/bin
22990 docfactory_identifier -Djava.library.path=/oracle/odee/documaker/bin
22991 docfactory_publisher -Djava.library.path=/oracle/odee/documaker/bin
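Armed with a PID, you can count handles for a single process. Below is a minimal sketch using the Assembler PID (22946) from the listing above; substitute the PID from your own system and run it as the process owner or root. Note that lsof prints a header row and also lists items such as memory-mapped files, so its count runs a little high; counting the entries in /proc/<pid>/fd gives just the open descriptors. If you need to take measurements while the system is processing, the watch utility (where installed) can sample the count on an interval.

# count everything lsof reports for the Assembler process (slightly high; includes lsof's header line and mapped files)
$ lsof -p 22946 | wc -l
# count just the open file descriptors for that process
$ ls /proc/22946/fd | wc -l
# sample the descriptor count every 30 seconds during a performance test
$ watch -n 30 'ls /proc/22946/fd | wc -l'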
In this case, the Assembler process has 195 handles currently; as I said, it's a small system. With this information you should have a good grasp on how to handle file descriptors (pardon the pun), and should you run into any errors that you can't handle (I can't help myself), please do comment, or seek assistance via Oracle Support or the Documaker community. As an aside, I was going to link to an existing blog post in case you need help setting up ODEE on Linux, and I realized I don't have one, so look for that soon!
