%//////////////////////////////////////////////////////////////////////////////
%
% Copyright (c) 2007-2022 Daniel Adler <dadler@uni-goettingen.de>, 
%                         Tassilo Philipp <tphilipp@potion-studios.com>
%
% Permission to use, copy, modify, and distribute this software for any
% purpose with or without fee is hereby granted, provided that the above
% copyright notice and this permission notice appear in all copies.
%
% THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
% WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
% MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
% ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
% WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
% ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
% OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
%
%//////////////////////////////////////////////////////////////////////////////

\subsection{MIPS32 Calling Conventions}

\paragraph{Overview}

Multiple revisions of the MIPS instruction set exist, namely MIPS I, MIPS II, MIPS III, MIPS IV, MIPS32 and MIPS64.
Nowadays, MIPS32 and MIPS64 are the main ones used for 32-bit and 64-bit instruction sets, respectively.\\
Given that MIPS processors are often used in embedded devices, several add-on extensions exist for the MIPS family, for example:

\begin{description}
\item [MIPS-3D] simple floating-point SIMD instructions dedicated to common 3D tasks.
\item [MDMX] (MaDMaX) more extensive integer SIMD instruction set using 64-bit floating-point registers.
\item [MIPS16e] adds compression to the instruction stream to make programs take up less room (allegedly a response to the THUMB instruction set of the ARM architecture).
\item [MIPS MT] multithreading additions to the system similar to HyperThreading.
\end{description}

Unfortunately, there is actually no such thing as ``The MIPS Calling Convention''. Many different conventions are used
by different environments, such as \emph{O32}\cite{MIPSo32}, \emph{O64}\cite{MIPSo64}, \emph{N32}\cite{MIPSn32/n64}, \emph{N64}\cite{MIPSn32/n64}, \emph{EABI}\cite{MIPSeabi} and \emph{NUBI}\cite{MIPSnubi}.

\paragraph{\product{dyncall} support}

Currently, \product{dyncall} supports the widely used O32 calling convention on 32-bit MIPS architectures (for all four combinations of big/little-endian and soft/hard-float targets),
as well as EABI (little-endian/hard-float, which is used by the Homebrew SDK for the PlayStation Portable). \product{dyncall} currently does not support MIPS16e
(contrary to the like-minded ARM-THUMB, which is supported). Both calls and callbacks are supported.


\clearpage


\subsubsection{MIPS EABI 32-bit Calling Convention}

% This is about hardware floating point targets, there are also softfloat ones @@@

\paragraph{Register usage}

\begin{table}[h]
\begin{tabular*}{0.95\textwidth}{lll}
Name                                   & Alias                & Brief description\\
\hline
{\bf \$0}                              & {\bf \$zero}         & hardware zero, scratch \\
{\bf \$1}                              & {\bf \$at}           & assembler temporary, scratch \\
{\bf \$2-\$3}                          & {\bf \$v0-\$v1}      & integer results, scratch \\
{\bf \$4-\$11}                         & {\bf \$a0-\$a7}      & integer arguments, or double precision float arguments, scratch \\
{\bf \$12-\$15,\$24}                   & {\bf \$t4-\$t7,\$t8} & integer temporaries, scratch \\
{\bf \$25}                             & {\bf \$t9}           & integer temporary, address of callee for PIC calls (by convention), scratch \\
{\bf \$16-\$23}                        & {\bf \$s0-\$s7}      & preserve \\
{\bf \$26,\$27}                        & {\bf \$k0,\$k1}      & reserved for kernel \\
{\bf \$28}                             & {\bf \$gp}           & global pointer, preserve \\
{\bf \$29}                             & {\bf \$sp}           & stack pointer, preserve \\
{\bf \$30}                             & {\bf \$s8/\$fp}      & frame pointer (some assemblers name it \$fp), preserve \\
{\bf \$31}                             & {\bf \$ra}           & return address, preserve \\
{\bf hi, lo}                           &                      & multiply/divide special registers \\
{\bf \$f0,\$f2}                        &                      & float results, scratch \\
{\bf \$f1,\$f3,\$f4-\$f11,\$f20-\$f23} &                      & float temporaries, scratch \\
{\bf \$f12-\$f19}                      &                      & single precision float arguments, scratch \\
\end{tabular*}
\caption{Register usage on MIPS32 EABI calling convention}
\end{table}

\paragraph{Parameter passing}

\begin{itemize}
\item Stack grows down
\item Stack parameter order: right-to-left
\item Caller cleans up the stack
\item first 8 integers (\textless=\ 32bit) are passed in registers \$a0-\$a7
\item first 8 single precision floating point arguments are passed in registers \$f12-\$f19
\item 64-bit stack arguments are always aligned to 8 bytes
\item 64-bit integers or double precision floats are passed in two general purpose registers starting at an even register number, skipping one odd register
\item if either integer or float registers are used up, the stack is used
\item if the callee takes the address of one of the parameters and uses it to address other unnamed parameters (e.g. varargs), it has to copy, in its prolog, the argument registers to a reserved stack area adjacent to the other parameters on the...
\item float registers don't seem to ever need to be saved that way, because floats passed to an ellipsis function are promoted to doubles, which in turn are passed in general purpose register pairs, so only \$a0-\$a7 need to be spilled
\item aggregates (struct, union) \textless=\ 32bit are passed like an integer
\item {\it non-trivial} C++ aggregates (as defined by the language) of any size, are passed indirectly via a pointer to a copy of the aggregate
\item all other aggregates (struct, union) are passed indirectly, as a pointer to a copy (if needed, and for vararg arguments required to be copied by the caller) of the struct
\end{itemize}
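
The register assignment rules above can be sketched as a small C model. This is an illustrative sketch only, not \product{dyncall}'s implementation: the names \texttt{ArgKind}, \texttt{ArgLoc}, \texttt{AllocState} and \texttt{assign\_arg} are hypothetical, aggregates and varargs are ignored, and it assumes that a 64-bit argument that no longer fits into a register pair also marks the remaining integer registers as used up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument classes: 32-bit integer, single precision float,
   and any 64-bit value (64-bit integer or double). */
typedef enum { ARG_I32, ARG_F32, ARG_64 } ArgKind;

typedef enum { LOC_GPR, LOC_GPR_PAIR, LOC_FPR, LOC_STACK } LocKind;

typedef struct {
    LocKind kind;
    int     index; /* N of $aN (first of a pair), N of $fN, or stack byte offset */
} ArgLoc;

typedef struct {
    int gpr;   /* number of $a registers consumed so far (0..8) */
    int fpr;   /* number of float argument registers consumed (0..8) */
    int stack; /* bytes of stack argument area consumed */
} AllocState;

/* Place an argument of 'size' bytes on the stack; 64-bit stack
   arguments are aligned to 8 bytes. */
static ArgLoc on_stack(AllocState *s, int size)
{
    ArgLoc loc;
    if (size == 8)
        s->stack = (s->stack + 7) & ~7;
    loc.kind  = LOC_STACK;
    loc.index = s->stack;
    s->stack += size;
    return loc;
}

/* Decide where the next argument of class 'k' goes. */
static ArgLoc assign_arg(AllocState *s, ArgKind k)
{
    ArgLoc loc;
    assert(s != NULL);
    switch (k) {
    case ARG_I32: /* first 8 integers go to $a0-$a7 */
        if (s->gpr < 8) {
            loc.kind  = LOC_GPR;
            loc.index = s->gpr++;
            return loc;
        }
        return on_stack(s, 4);
    case ARG_F32: /* first 8 single precision floats go to $f12-$f19 */
        if (s->fpr < 8) {
            loc.kind  = LOC_FPR;
            loc.index = 12 + s->fpr++;
            return loc;
        }
        return on_stack(s, 4);
    default:      /* ARG_64: register pair starting at an even $a register */
        s->gpr += s->gpr & 1; /* skip one odd register if needed */
        if (s->gpr < 8) {
            loc.kind  = LOC_GPR_PAIR;
            loc.index = s->gpr;
            s->gpr   += 2;
            return loc;
        }
        s->gpr = 8; /* assumption of this sketch: integer registers are now used up */
        return on_stack(s, 8);
    }
}
```

For a call taking the argument classes (32-bit integer, 64-bit value, float, 32-bit integer), this model assigns \$a0, the pair \$a2/\$a3 (skipping \$a1), \$f12 and \$a4, in that order.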

\paragraph{Return values}

\begin{itemize}
\item results are returned in \$v0 (32-bit), \$v0 and \$v1 (64-bit), \$f0, or \$f0 and \$f2 (2 $\times$ 32-bit float, e.g. complex)
\item for {\it non-trivial} C++ aggregates, the caller allocates space, passes a pointer to it to the callee as a hidden first param
(meaning in \$a0), and the callee writes the return value to this space; the ptr to the aggregate is returned in \$v0
\item aggregates (struct, union) \textless=\ 64bit are returned like an integer (aligned within the register according to endianness)