Dictionary Definition
thunk n : a dull hollow sound; "the basketball
made a thunk as it hit the rim"
User Contributed Dictionary
English
Pronunciation
-
- Rhymes: -ʌŋk
Etymology 1
By analogy with past tenses and past participles ending in "-unk", such as drunk and sunkVerb form
thunk- past participle of think
- Who would have thunk it?
- Who'd'a thunk it?
- Who would have thunk it?
Etymology 2
OnomatopoeicInterjection
- Representing the sound of the impact of a heavy object striking another and coming to an immediate standstill, with neither object being broken by the impact.
Etymology 3
Claimed by the inventors to be from the supposed
past tense, being coined when they realised after much thought
(whence "thunk") that the type of an argument in ALGOL 60 could
be predetermined at compile
time; not, as is sometimes claimed, from the interjection,
being the supposed sound made by data hitting the stack or an
accumulator
Noun
- In the context of "computing|functional programming": a delayed computation
- In the Scheme programming language, a function or procedure taking no arguments.
- a mapping of machine data from one system-specific form to another, usually for compatibility reasons, such as from 16-bit addresses to 32-bit to allow a 16-bit program to run on a 32-bit operating system
See also
Extensive Definition
- "Thunk" may also be used jocularly as a past tense of the verb "think".
The word thunk has at least three related
meanings in computer science. A "thunk" may be:
- a piece of code to perform a delayed computation (similar to a closure)
- a feature of some virtual function table implementations (similar to a wrapper function)
- a mapping of machine data from one system-specific form to another, usually for compatibility reasons
In all three senses, the word thunk refers to a
piece of low-level code, usually machine-generated, that implements
some detail of a particular software system.
Thunk as delayed computation
Call by name
Implementations of the call by name and call by need evaluation strategies typically use a thunk to refer to the function's argument. In this context, the thunk is simply a computation that, when executed, returns the value (if any) of the argument.In call-by-need, the thunk is replaced by its
return value after its first execution. In languages with late
binding, the "computation" performed by the thunk may include a
lookup in the run-time context of the program to determine the
current binding of a variable.
An early implementation of thunks for
call-by-name was in Algol 60.
begin integer idx; real procedure sum (i, lo, hi,
term); value lo, hi; integer i, lo, hi; real term; comment term is
passed by-name, and so is i; begin real temp; temp := 0; for i :=
lo step 1 until hi do temp := temp + term; sum := temp end; print
(sum (idx, 1, 100, 1/idx)) end
The above example (see Jensen's
Device) relies on the fact that the actual parameters idx and
1/idx are passed "by name", so that the program is equivalent to
the "inlined" version
begin integer idx; real sum; begin real temp;
temp := 0; for idx := 1 step 1 until 100 do temp := temp + 1/idx;
sum := temp end; print (sum) end
Notice that the expression 1/i is not evaluated
at the point of the call to sum; instead, it is evaluated anew each
time the formal parameter term is mentioned in the definition of
sum. A compiler using
thunks to implement call-by-name would process the original code as
if it had been written using function pointers or lambdas
(represented below in Algol-like pseudocode):
begin integer idx; real procedure sum (i_lvalue,
lo, hi, term_rvalue); value lo, hi; integer lo, hi; thunk i_lvalue;
thunk term_rvalue; begin real temp; temp := 0; for i_lvalue() := lo
step 1 until hi do temp := temp + term_rvalue(); sum := temp end;
procedure lvalue_of_idx (); begin lvalue_of_idx := address of idx
end; procedure rvalue_of_1_over_idx (); begin rvalue_of_1_over_idx
:= 1/idx end; print (sum (lvalue_of_idx, 1, 100,
rvalue_of_1_over_idx)) end
The procedures lvalue_of_idx and
rvalue_of_1_over_idx would be generated automatically by the
compiler whenever a call-by-name actual parameter was encountered.
These automatically generated procedures would be called
thunks.
Functional programming
In functional programming, "thunk" is another name for a nullary function — a function that takes no arguments. Thunks are frequently used in strict languages as a means of simulating lazy evaluation; the thunk itself delays the computation of a function's argument, and the function forces the thunk to obtain the actual value. In this context, a thunk is often called a suspension or (in Scheme) a promise.Thunks also arise naturally in other situations,
for example in the implementation of constant
functions, which may be useful in higher-order
programming. In Common Lisp,
constant functions are created by constantly: (constantly 6)
evaluates to a thunk that, when called, always yields the value
6.
Thunks in object-oriented programming
Some compilers for object-oriented languages such as C++ generate functions called "thunks" as an optimization of virtual function calls in the presence of multiple or virtual inheritance. Consider the C++ codestruct A ; struct B ; struct C : public A, public
B ; int use(B *b) ... C c; use(&c); ...
Since the function B::access is virtual, the call
to b->access() requires a vtable dispatch. In naïve
implementations, the dispatch will consist of five steps:
- The object b holds a pointer to the vtable. Load that pointer into a register.
- The vtable entry for B::access is at some known offset in the vtable for B; find that entry E.
- E contains a pointer to a function (in this case, C::access). Load that function pointer C::access.
- Since C::access expects a this pointer to an instance of C, but b is an instance of B, we must decrement b by the offset of B in C (in this example, probably 8 bytes: the size of C::A::value plus the size of C::A's vtable pointer). Since this offset is not known to use at compile time, it must also be loaded from E.
- Finally, call C::access with the adjusted value of b.
The fourth step, in which an offset is loaded
from E and added to b, can be completely eliminated by the
compiler, thus speeding up every virtual method call, if the
compiler generates a wrapper function like this, and places its
address in the vtable entry E:
int thunk_for_C_access_in_B(B *b)
Then the steps for use() become:
- The object b holds a pointer to the vtable. Load that pointer into a register.
- The vtable entry for B::access is at some known offset in the vtable for B; find that entry E.
- E contains a pointer to a function (in this case, C::access). Load that function pointer W.
- Call W with the adjusted value of b. If b was really of dynamic type B, then W = B::access, and so we have saved two instructions (an expensive memory load and a cheap addition). If b was really of dynamic type C, then W = thunk_for_C_access_in_B, and so we have added one instruction (a cheap unconditional branch at the end of thunk_for_C_access_in_B).
Since the particular pattern of multiple
inheritance in class C is rare in practice, we will generally save
more instructions than we add. At the same time, we no longer need
to store an offset for each entry E in the vtable, and so we have
halved the size of every vtable in the program.
The name "thunk" for these compiler-generated
functions is something of a misnomer, since they don't have
anything to do with delaying computation and could have been
described simply as compiler-generated wrapper
functions, but the term "thunk" for these functions is now
quite well established.
References
- Bjarne Stroustrup, "Multiple Inheritance in C++" (C/C++ Users Journal, May 1999). See section 5.1 for the naïve vtable representation.
- Karel Driesen and Urs Hölzle, "The Direct Cost of Virtual Function Calls in C++". (This paper's conclusions are also included in Driesen's Efficient Polymorphic Calls, Springer, 2001.)
Thunk as compatibility mapping
Thunking may also refer to creating a 16-bit
virtual
DOS machine (VDM) within a 32-bit operating platform so that
there is backward compatibility for applications using older code
or system
calls.
OS/2 & Windows 16-bit address hack
A piece of code is executed to provide an address. The most common usage is in the Win16/Win32 API, where thunking is used to convert a 16-bit address into a 32-bit equivalent or vice versa. One ubiquitous early example was "wsock32.dll", a thunking layer added to allow Win32 Internet applications to use the Win16 winsock.dll library originally written for Windows 3.11. Similar thunking was allowed from OS/2 to Win32 code in the Windows NT OS/2 subsystem and this was documented in the Resource Kit for Windows NT versions up to Windows 2000. Similar thunking was required in many cases in OS/2 2.x—while most of the operating system was 32-bit, many parts of the kernel and device drivers were 16-bit for compatibility reasons.A flat thunk consists of a pair of DLLs (one
32-bit and one 16-bit) that are used to translate calls from 32-bit
code to 16-bit code. 16- and 32-bit memory addresses work very
differently: 16-bit addresses consist of two parts, a pointer to a
memory segment, and the offset from the start of that memory
segment; whereas a 32bit process memory pointer consists of the
absolute address of the memory being accessed. To allow the two
dlls to communicate, some intermediate code must be used to
translate memory addresses (pointers) between platforms.