Process-in-Process
new URL: https://github.com/procinproc
This repo is no longer maintained. Refer to the above new repo.
PiP is a user-level library to have the best of the both worlds
of multi-process and multi-thread parallel execution models. PiP
allows a process to create sub-processes into the same virtual address
space where the parent process runs. The parent process and
sub-processes share the same address space, however, each process has
its own variable set. So, each process runs independently from the
other process. If some or all processes agree, then data owned by a
process can be accessed by the other processes. Those processes share
the same address space, just like pthreads, but each process has its
own variables like the process execution model. Hereinafter, the
parent process is called PiP process and sub-processes are called
PiP tasks.
Currently there are three PiP library versions:
Unfortunately each version has unique ABI and there is no ABI
compatibility among them. The functionality of PiP-v1 is almost the
same with PiP-v2, however, PiP-v2’s API is a subset of the PiP-v3’s
API. Hereafter NN denotes the PiP version number.
PiP also provides new thread implementation named “Bi-Level Thread
(BLT)”, again, to take the best of two worlds, Kernel-Level Thread
(KLT) and User-Level Thread (ULT) here. A BLT is a PiP task. When a
PiP task is created it runs as a KLT. At any point the KLT can
becomme a ULT by decoupling the associated kernel thread from the
KLT. The decoupled kernel thread becommes idle. Later, the ULT can
become KLT again by coupling with the kernel thread.
As described, PiP allows PiP tasks to share the same virtual
address space. This mans that a PiP task can context-switch to the
other PiP task at user-level. This is called User-Level Process
where processes may be derived from the same program or different
programs. Threads basically share most of the kernel resources,
such as address space, file descriptors, a process id, and so
on whilst processes do not. Every process has its ows file
descriptor space, for example. When a ULP is scheduled by a KLT
having PID 1000, then the getpid() is called by the ULP
returns 1000. Further, when the ULT is migrated to be scheduled by
the other KLT, then the returned PID is different. So, when
implemnting a ULP system, this systemcall consistency must be
preserved. In ULP on PiP, the consistency can be
maintained by utilizing the above BLT mechanism. When a ULT tries
to call a system call, it is coupled with its kernel thread which
was created at the beginning as a KLT. It should be note that Thread
Local Storage (TLS) regions are also switched when switching ULP (and
BLT) contexts.
There are several PiP implementation modes which can be selected at
the runtime. These implementations can be categorized into two;
In the pthread mode, although each PiP task has its own static
variables unlike thread, PiP task behaves more like PThread, having a TID,
having the same file descriptor space, having the same signal
delivery semantics as Pthread does, and so on.
In the process mode, a PiP task behaves more like a process, having
a PID, having an independent file descriptor space, having the same
signal delivery semantics as Linux process does, and so on. The
above mentioned ULP can only work with the process mode.
When the PIP_MODE environment variable set to “thread”
then
the PiP library runs in the pthread mode, and if it is set to
“process” then it runs in the process mode. There are also three
implementations in the process mode; “process:preload,”
“process:pipclone” and “process:got.” The “process:preload” mode
must be with the LD_PRELOAD environment variable setting so that
the clone() system call wrapper can work with. The
“process:pipclone” mode is only effective with the PIP-patched
glibc library (see below).
Several function are made available by the PiP library to absorb the
functional differences due to the execution modes.
This package is licensed under the 2-clause simplified BSD License -
see the LICENSE file for details.
Basically PiP requires the following three software packages (new links);
By using PiP-glibc, users can create up to 300 PiP tasks which can be
debugged by using PiP-gdb. In other words, without installing
PiP-glibc, users can create up to around 10 PiP tasks (the number
depends on the program) and cannot debug by using PiP-gdb.
There are several ways to install the PiP packages; Yum (RPM), Docker,
Spack, and building from the source code. It is strongly recommended
to use the following PiP package installation program (pip-pip):
This is the easiest way to install PiP packages in any form. Here is
the example of pip-pip
usage:
$ git clone https://github.com/procinproc/PiP-pip.git
$ cd PiP-pip
$ ./pip-pip --how=HOW --pip=PIP_VERSION --work=BUILD_DIR --prefix=INSTALL_DIR
HOW can be one of yum
, docker
, spack
and github
, or any
combination of them. pip-pip --help
will show you how to use the
program. yum
, docker
and spack
include all three packages;
PiP-glibc, PiP-lib, and PiP-gdb.
The following PiP documents are created by using Doxygen.
Man pages will be installed at PIP_INSTALL_DIR/share/man.
$ man -M PIP_INSTALL_DIR/share/man 7 libpip
Or, use the pip-man command (from v2).
$ PIP_INSTALL_DIR/bin/pip-man 7 libpip
The above two exammples will show you the same document you are reading.
PDF documents will be installed at
PIP_INSTALL_DIR/share/pdf.
HTML documents will be installed at
PIP_INSTALL_DIR/share/html.
You can use pipcc(1) command to compile and link your PiP programs.
$ pipcc -Wall -O2 -g -c pip-prog.c
$ pipcc -Wall -O2 -g pip-prog.c -o pip-prog
Let’s assume that you have a non-PiP program(s) and wnat to run as PiP
tasks. All you have to do is to compile your program by using the above
pipcc(1) command and to use the pip-exec(1) command to run your program
as PiP tasks.
$ pipcc myprog.c -o myprog
$ pip-exec -n 8 ./myprog
$ ./myprog
In this case, the pip-exec(1) command becomes the PiP root and your program
runs as 8 PiP tasks. Note that the ‘myprog.c’ may or may not call any
PiP functions. Your program can also run as a normal program (not as a
PiP task) without using the pip-exec(1) command. In either case, your
proghrams must be compiled and linked by using the pipcc(1) command
described above.
You may write your own PiP programs whcih includes the PiP root programming.
In this case, your program can run without using the pip-exec(1) command.
If you get the following message when you try to run your program;
PiP-ERR(19673) './myprog' is not PIE
Then this means that the ‘myprog’ (having PID 19673) is not compiled
by using the pipcc(1) command properly. You may check if your
program(s) can run as a PiP root and/or PiP task by using the
pip-check(1) command (from v2);
$ pip-check a.out
a.out : Root&Task
Above example shows that the ‘a.out’ program can run as a PiP root and PiP tasks.
pips(1) command (from v2)
Similar to the Linux ps
command, you can see how your PiP program(s)
is (are) running by using the pips(1)
command. pips
can accept
‘a’, ‘u’ and ‘x’ options just like the ps
command.
$ pips [a][u][x] [PIPS-OPTIONS] [-] [PATTERN ..]
List the PiP tasks via the ‘ps’ command;
$ pips —ps [ PATTERN .. ]
or, show the activities of PiP tasks via the ‘top’ command;
$ pips —top [ PATTERN .. ]
Additionally you can kill all of your PiP tasks by using the same pips(1) command;
$ pips -s KILL [ PATTERN .. ]
The following procedure attaches all PiP tasks and PiP root which
created those tasks. Each PiP task is treated as a GDB inferior
in PiP-gdb. Note that PiP-glibc and PiP-gdb packages are required to
do this.
$ pip-gdb
(pip-gdb) attach PID
The attached inferiors can be seen by the following GDB command:
(pip-gdb) info inferiors
Num Description Executable
4 process 6453 (pip 2) /somewhere/pip-task-2
3 process 6452 (pip 1) /somewhere/pip-task-1
2 process 6451 (pip 0) /somewhere/pip-task-0
* 1 process 6450 (pip root) /somewhere/pip-root
You can select and debug an inferior by the following GDB command:
(pip-gdb) inferior 2
[Switching to inferior 2 [process 6451 (pip 0)] (/somewhere/pip-task-0)]
When an already-attached program calls ‘pip_spawn()’ and becomes
a PiP root task, the newly created PiP child tasks aren’t attached
automatically, but you can add empty inferiors and then attach
the PiP child tasks to the inferiors.
e.g.
.... type Control-Z to stop the root task.
^Z
Program received signal SIGTSTP, Stopped (user).
(pip-gdb) add-inferior
Added inferior 2
(pip-gdb) inferior 2
(pip-gdb) attach 1902
(pip-gdb) add-inferior
Added inferior 3
(pip-gdb) inferior 3
(pip-gdb) attach 1903
(pip-gdb) add-inferior
Added inferior 4
(pip-gdb) inferior 4
(pip-gdb) attach 1904
(pip-gdb) info inferiors
Num Description Executable
* 4 process 1904 (pip 2) /somewhere/pip-task-2
3 process 1903 (pip 1) /somewhere/pip-task-1
2 process 1902 (pip 0) /somewhere/pip-task-0
1 process 1897 (pip root) /somewhere/pip-root
You can attach all relevant PiP tasks by:
$ pip-gdb -p PID-of-your-PiP-program
(from v2)
If the PIP_GDB_PATH environment is set to the path pointing to PiP-gdb
executable file, then PiP-gdb is automatically attached when an
excetion signal (SIGSEGV or SIGHUP by default) is delivered. The
exception signals can also be defined by
setting the PIP_GDB_SIGNALS environment. Signal names (case
insensitive) can be
concatenated by the ‘+’ or ‘-‘ symbol. ‘all’ is reserved to specify most
of the signals. For example, ‘ALL-TERM’ means all signals excepting
SIGTERM, another example, ‘PIPE+INT’ means SIGPIPE and SIGINT. If
one of the specified or default signals is delivered, then PiP-gdb will be
attached automatically. The PiP-gdb will show backtrace by default. If users
specify PIP_GDB_COMMAND, a filename containing some GDB
commands, then those GDB commands will be executed by PiP-gdb in batch
mode. If the PIP_STOP_ON_START environment is
set, then the PiP library delivers SIGSTOP to a spawned
PiP task which is about to start user program. If its value is a
number in decimal, then the PiP task whose PiP-ID is the same with the
specified number will be stopped. If the number is minus, then all
PiP tasks will be stopped at the very beginning. Do not forget to
compile your programs with a debug option.
If you have any questions or comments on PiP, send e-mails to;
procinproc-info@googlegroups.com
Or, join our PiP users’ mailing list;
procinproc-users@googlegroups.com
Atsushi Hori, Min Si, Balazs Gerofi, Masamichi Takagi, Jay Dayal,
Pavan Balaji, and Yutaka Ishikawa. “Process-in-process: techniques for
practical address-space sharing,” In Proceedings of the 27th
International Symposium on High-Performance Parallel and Distributed
Computing (HPDC ‘18). ACM, New York, NY, USA, 131-143. DOI:
https://doi.org/10.1145/3208040.3208045
Atsushi Hori, Balazs Gerofi, and Yuataka Ishikawa. “An Implementation
of User-Level Processes using Address Space Sharing,” 2020 IEEE
International Parallel and Distributed Processing Symposium Workshops
(IPDPSW), New Orleans, LA, USA, 2020, pp. 976-984, DOI:
https://doi.org/10.1109/IPDPSW50202.2020.00161.
Kaiming Ouyang, Min Si, Atsushi Hori, Zizhong Chen, and Pavan
Balaji. 2020. “CAB-MPI: exploring interprocess work-stealing towards
balanced MPI communication,” In Proceedings of the International
Conference for High Performance Computing, Networking, Storage and
Analysis (SC ‘20). IEEE Press, Article 36, 1–15.
Atsushi Hori
National Institute for Informatics
(formerly Riken Center for Commputational Science)
Japan