Match Specifications in Erlang
A "match specification" (match_spec) is an Erlang term describing a small
"program" that tries to match something. It can be used to either control
tracing with erlang:trace_pattern/3 or to search for objects in an ETS table
with, for example, ets:select/2. The match specification in many ways works like
a small function in Erlang, but is interpreted/compiled by the Erlang runtime
system to something much more efficient than calling an Erlang function. The
match specification is also very limited compared to the expressiveness of real
Erlang functions.
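As a first illustration, the following minimal sketch selects from an ETS table with ets:select/2. The module name ms_intro, the table name people, and the sample data are hypothetical and chosen only for this example.

```erlang
-module(ms_intro).
-export([adults/0]).

%% Builds a throwaway ETS table of {Name, Age} tuples and selects the
%% names of everyone aged 18 or more using a match specification.
adults() ->
    Tab = ets:new(people, [set]),
    true = ets:insert(Tab, [{alice, 31}, {bob, 12}, {carol, 18}]),
    MatchSpec = [{{'$1', '$2'},          % MatchHead: bind name and age
                  [{'>=', '$2', 18}],    % MatchConditions: guard-like test
                  ['$1']}],              % MatchBody: return the name
    Names = ets:select(Tab, MatchSpec),
    true = ets:delete(Tab),
    %% A set table has no defined traversal order, so sort the result.
    lists:sort(Names).
```

The match specification is an ordinary Erlang term, so it can be built, stored, and passed around like any other data.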
The most notable difference between a match specification and an Erlang fun is the syntax. Match specifications are Erlang terms, not Erlang code. Also, a match specification has a strange concept of exceptions:
- An exception (such as badarg) in the MatchCondition part, which resembles an Erlang guard, generates immediate failure.
- An exception in the MatchBody part, which resembles the body of an Erlang function, is implicitly caught and results in the single atom 'EXIT'.
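The two exception rules can be observed with ets:test_ms/2, which runs a match specification against a single tuple. The sketch below is only an illustration; the module name ms_exceptions and the atoms not_a_list and matched are hypothetical.

```erlang
-module(ms_exceptions).
-export([in_condition/0, in_body/0]).

%% hd/1 raises badarg on a non-list. In a MatchCondition the exception
%% makes the match fail, so ets:test_ms/2 reports that nothing matched.
in_condition() ->
    ets:test_ms({not_a_list},
                [{{'$1'}, [{is_atom, {hd, '$1'}}], [matched]}]).

%% The same call in the MatchBody is implicitly caught and the
%% expression evaluates to the atom 'EXIT'.
in_body() ->
    ets:test_ms({not_a_list},
                [{{'$1'}, [], [{hd, '$1'}]}]).
```

Under these assumptions, in_condition/0 is expected to return {ok, false} and in_body/0 to return {ok, 'EXIT'}.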
Grammar
A match specification used in tracing can be described in the following informal grammar:
- MatchExpression ::= [ MatchFunction, ... ]
- MatchFunction ::= { MatchHead, MatchConditions, MatchBody }
- MatchHead ::= MatchVariable | '_' | [ MatchHeadPart, ... ]
- MatchHeadPart ::= term() | MatchVariable | '_'
- MatchVariable ::= '$<number>'
- MatchConditions ::= [ MatchCondition, ...] | []
- MatchCondition ::= { GuardFunction } | { GuardFunction, ConditionExpression, ... }
- BoolFunction ::= is_atom | is_float | is_integer | is_list | is_number | is_pid | is_port | is_reference | is_tuple | is_map | is_map_key | is_binary | is_bitstring | is_boolean | is_function | is_record | is_seq_trace | 'and' | 'or' | 'not' | 'xor' | 'andalso' | 'orelse'
- ConditionExpression ::= ExprMatchVariable | { GuardFunction } | { GuardFunction, ConditionExpression, ... } | TermConstruct
- ExprMatchVariable ::= MatchVariable (bound in the MatchHead) | '$_' | '$$'
- TermConstruct ::= {{}} | {{ ConditionExpression, ... }} | [] | [ConditionExpression, ...] | #{} | #{term() => ConditionExpression, ...} | NonCompositeTerm | Constant
- NonCompositeTerm ::= term() (not list or tuple or map)
- Constant ::= {const, term()}
- GuardFunction ::= BoolFunction | abs | element | hd | length | map_get | map_size | max | min | node | float | round | floor | ceil | size | bit_size | byte_size | tuple_size | tl | trunc | binary_part | '+' | '-' | '*' | 'div' | 'rem' | 'band' | 'bor' | 'bxor' | 'bnot' | 'bsl' | 'bsr' | '>' | '>=' | '<' | '=<' | '=:=' | '==' | '=/=' | '/=' | self | get_tcw
- MatchBody ::= [ ActionTerm ]
- ActionTerm ::= ConditionExpression | ActionCall
- ActionCall ::= {ActionFunction} | {ActionFunction, ActionTerm, ...}
- ActionFunction ::= set_seq_token | get_seq_token | message | return_trace | exception_trace | process_dump | enable_trace | disable_trace | trace | display | caller | caller_line | current_stacktrace | set_tcw | silent
A match specification used in ets can be described in the following
informal grammar:
- MatchExpression ::= [ MatchFunction, ... ]
- MatchFunction ::= { MatchHead, MatchConditions, MatchBody }
- MatchHead ::= MatchVariable | '_' | { MatchHeadPart, ... }
- MatchHeadPart ::= term() | MatchVariable | '_'
- MatchVariable ::= '$<number>'
- MatchConditions ::= [ MatchCondition, ...] | []
- MatchCondition ::= { GuardFunction } | { GuardFunction, ConditionExpression, ... }
- BoolFunction ::= is_atom | is_float | is_integer | is_list | is_number | is_pid | is_port | is_reference | is_tuple | is_map | is_map_key | is_binary | is_bitstring | is_boolean | is_function | is_record | 'and' | 'or' | 'not' | 'xor' | 'andalso' | 'orelse'
- ConditionExpression ::= ExprMatchVariable | { GuardFunction } | { GuardFunction, ConditionExpression, ... } | TermConstruct
- ExprMatchVariable ::= MatchVariable (bound in the MatchHead) | '$_' | '$$'
- TermConstruct ::= {{}} | {{ ConditionExpression, ... }} | [] | [ConditionExpression, ...] | #{} | #{term() => ConditionExpression, ...} | NonCompositeTerm | Constant
- NonCompositeTerm ::= term() (not list or tuple or map)
- Constant ::= {const, term()}
- GuardFunction ::= BoolFunction | abs | element | hd | length | map_get | map_size | max | min | node | float | round | floor | ceil | size | bit_size | byte_size | tuple_size | tl | trunc | binary_part | '+' | '-' | '*' | 'div' | 'rem' | 'band' | 'bor' | 'bxor' | 'bnot' | 'bsl' | 'bsr' | '>' | '>=' | '<' | '=<' | '=:=' | '==' | '=/=' | '/=' | self
- MatchBody ::= [ ConditionExpression, ... ]
Function Descriptions
Functions Allowed in All Types of Match Specifications
The functions allowed in match_spec work as follows:
- is_atom, is_boolean, is_float, is_integer, is_list, is_number, is_pid, is_port, is_reference, is_tuple, is_map, is_binary, is_bitstring, is_function - Same as the corresponding guard tests in Erlang, return true or false.
- is_record - Takes an additional parameter, which must be the result of record_info(size, <record_type>), like in {is_record, '$1', rectype, record_info(size, rectype)}.
- 'not' - Negates its single argument (anything other than false gives false).
- 'and' - Returns true if all its arguments (variable length argument list) evaluate to true, otherwise false. Evaluation order is undefined.
- 'or' - Returns true if any of its arguments evaluates to true. Variable length argument list. Evaluation order is undefined.
- 'andalso' - Works as 'and', but quits evaluating its arguments when one argument evaluates to something else than true. Arguments are evaluated left to right.
- 'orelse' - Works as 'or', but quits evaluating as soon as one of its arguments evaluates to true. Arguments are evaluated left to right.
- 'xor' - Only two arguments, of which one must be true and the other false to return true; otherwise 'xor' returns false.
- abs, element, hd, length, map_get, map_size, max, min, node, round, ceil, floor, float, size, bit_size, byte_size, tuple_size, tl, trunc, binary_part, '+', '-', '*', 'div', 'rem', 'band', 'bor', 'bxor', 'bnot', 'bsl', 'bsr', '>', '>=', '<', '=<', '=:=', '==', '=/=', '/=', self - Same as the corresponding Erlang BIFs (or operators). In case of bad arguments, the result depends on the context. In the MatchConditions part of the expression, the test fails immediately (like in an Erlang guard). In the MatchBody part, exceptions are implicitly caught and the call results in the atom 'EXIT'.
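The left-to-right evaluation of 'orelse' can be sketched as follows. This is an illustration only; the module name ms_guards and the result atom accepted are hypothetical.

```erlang
-module(ms_guards).
-export([check/1]).

%% Accepts {X} when X is an atom or a list with more than two elements.
%% 'orelse' stops at the first argument that evaluates to true, so
%% {length, '$1'} is not evaluated when '$1' is an atom.
check(X) ->
    MatchSpec = [{{'$1'},
                  [{'orelse', {is_atom, '$1'}, {'>', {length, '$1'}, 2}}],
                  [accepted]}],
    ets:test_ms({X}, MatchSpec).
```

For a term that is neither an atom nor a list, such as an integer, the length call raises an exception inside the MatchConditions part, so the match fails.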
Functions Allowed Only for Tracing
The functions allowed only for tracing work as follows:
- is_seq_trace - Returns true if a sequential trace token is set for the current process, otherwise false.
- set_seq_token - Works as seq_trace:set_token/2, but returns true on success, and 'EXIT' on error or bad argument. Only allowed in the MatchBody part and only allowed when tracing.
- get_seq_token - Same as seq_trace:get_token/0 and only allowed in the MatchBody part when tracing.
- message - Sets an additional message appended to the trace message sent. One can only set one additional message in the body. Later calls replace the appended message.

  As a special case, {message, false} disables sending of trace messages ('call' and 'return_to') for this function call, just like if the match specification had not matched. This can be useful if only the side effects of the MatchBody part are desired.

  Another special case is {message, true}, which sets the default behavior, as if the function had no match specification; trace message is sent with no extra information (if no other calls to message are placed before {message, true}, it is in fact a "noop").

  Takes one argument: the message. Returns true and can only be used in the MatchBody part and when tracing.
- return_trace - Causes a return_from trace message to be sent upon return from the current function. Takes no arguments, returns true and can only be used in the MatchBody part when tracing. If the process trace flag silent is active, the return_from trace message is inhibited.

  Warning: If the traced function is tail-recursive, this match specification function destroys that property. Hence, if a match specification executing this function is used on a perpetual server process, it can only be active for a limited period of time, or the emulator will eventually use all memory in the host machine and crash. If this match specification function is inhibited using process trace flag silent, tail-recursiveness still remains.
- exception_trace - Works as return_trace, with one addition: if the traced function exits because of an exception, an exception_from trace message is generated, regardless of whether the exception is caught or not.
- process_dump - Returns some textual information about the current process as a binary. Takes no arguments and is only allowed in the MatchBody part when tracing.
- enable_trace - With one parameter this function turns on tracing like the Erlang call erlang:trace(self(), true, [P2]), where P2 is the parameter to enable_trace.

  With two parameters, the first parameter is to be either a process identifier or the registered name of a process. In this case tracing is turned on for the designated process in the same way as in the Erlang call erlang:trace(P1, true, [P2]), where P1 is the first and P2 is the second argument. The process P1 gets its trace messages sent to the same tracer as the process executing the statement uses. P1 cannot be one of the atoms all, new or existing (unless they are registered names). P2 cannot be cpu_timestamp or tracer.

  Returns true and can only be used in the MatchBody part when tracing.
- disable_trace - With one parameter this function disables tracing like the Erlang call erlang:trace(self(), false, [P2]), where P2 is the parameter to disable_trace.

  With two parameters this function works as the Erlang call erlang:trace(P1, false, [P2]), where P1 can be either a process identifier or a registered name and is specified as the first argument to the match specification function. P2 cannot be cpu_timestamp or tracer.

  Returns true and can only be used in the MatchBody part when tracing.
- trace - With two parameters this function takes a list of trace flags to disable as first parameter and a list of trace flags to enable as second parameter. Logically, the disable list is applied first, but effectively all changes are applied atomically. The trace flags are the same as for erlang:trace/3, not including cpu_timestamp, but including tracer.

  If a tracer is specified in both lists, the tracer in the enable list takes precedence. If no tracer is specified, the same tracer as the process executing the match specification is used (not the meta tracer). If that process does not have a tracer either, the trace flags are ignored.

  When using a tracer module, the module must be loaded before the match specification is executed. If it is not loaded, the match fails.

  With three parameters to this function, the first is either a process identifier or the registered name of a process to set trace flags on, the second is the disable list, and the third is the enable list.

  Returns true if any trace property was changed for the trace target process, otherwise false. Can only be used in the MatchBody part when tracing.
- caller - Returns the calling function as a tuple {Module, Function, Arity} or the atom undefined if the calling function cannot be determined. Can only be used in the MatchBody part when tracing.

  Notice that if a "technically built in function" (that is, a function not written in Erlang) is traced, the caller function sometimes returns the atom undefined. The calling Erlang function is not available during such calls.
- caller_line - Similar to caller but returns additional information about the source code location of the function call-site within the caller function. Returns the calling function as a tuple {Module, Function, Arity, {File, Line}}. File is the string file name while Line is the source line number. If the File and Line cannot be determined, {Module, Function, Arity, undefined} is returned. If the calling function cannot be determined, the atom undefined is returned. Can only be used in the MatchBody part when tracing.

  Notice that if a "technically built in function" (that is, a function not written in Erlang) is traced, the caller_line function sometimes returns the atom undefined. The calling Erlang function is not available during such calls.
- current_stacktrace - Returns the current call stack back-trace (stacktrace) of the calling function. The stack has the same format as in the catch part of a try. See The call-stack back trace (stacktrace). The depth of the stacktrace is truncated according to the backtrace_depth system flag setting.

  Accepts a depth parameter. The depth value will be backtrace_depth if the argument is greater.
- display - For debugging purposes only. Displays the single argument as an Erlang term on stdout, which is seldom what is wanted. Returns true and can only be used in the MatchBody part when tracing.
- get_tcw - Takes no argument and returns the value of the node's trace control word. The same is done by erlang:system_info(trace_control_word).

  The trace control word is a 32-bit unsigned integer intended for generic trace control. The trace control word can be tested and set both from within trace match specifications and with BIFs. This call is only allowed when tracing.
- set_tcw - Takes one unsigned integer argument, sets the value of the node's trace control word to the value of the argument, and returns the previous value. The same is done by erlang:system_flag(trace_control_word, Value). It is only allowed to use set_tcw in the MatchBody part when tracing.
- silent - Takes one argument. If the argument is true, the call trace message mode for the current process is set to silent for this call and all later calls, that is, call trace messages are inhibited even if {message, true} is called in the MatchBody part for a traced function.

  This mode can also be activated with flag silent to erlang:trace/3.

  If the argument is false, the call trace message mode for the current process is set to normal (non-silent) for this call and all later calls.

  If the argument is not true or false, the call trace message mode is unaffected.
Note
All "function calls" must be tuples, even if they take no arguments. The value of
self is the atom() self, but the value of {self} is the pid() of the current process.
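The difference between the bare atom and the tuple form can be checked with ets:test_ms/2, which evaluates the match specification in the calling process. The module name ms_self in this sketch is hypothetical.

```erlang
-module(ms_self).
-export([atom_self/0, call_self/0]).

%% A bare atom in the MatchBody is a literal, so the result is the
%% atom self.
atom_self() ->
    ets:test_ms({x}, [{'_', [], [self]}]).

%% A one-element tuple is a "function call", so the result is the pid
%% of the process that evaluates the match specification.
call_self() ->
    ets:test_ms({x}, [{'_', [], [{self}]}]).
```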
Match target
Each execution of a match specification is done against a match target term. The format and content of the target term depends on the context in which the match is done. The match target for ETS is always a full table tuple. The match target for call trace is always a list of all function arguments. The match target for event trace depends on the event type, see table below.
| Context | Type | Match target | Description |
|---|---|---|---|
| ETS | | {Key, Value1, Value2, ...} | A table object |
| Trace | call | [Arg1, Arg2, ...] | Function arguments |
| Trace | send | [Receiver, Message] | Receiving process/port and message term |
| Trace | 'receive' | [Node, Sender, Message] | Sending node, process/port and message term |
Table: Match target depending on context
Variables and Literals
Variables take the form '$<number>', where <number> is an integer between 0
and 100,000,000 (1e+8). The behavior if the number is outside these limits is
undefined. In the MatchHead part, the special variable '_' matches
anything, and never gets bound (like _ in Erlang).
- In the MatchCondition/MatchBody parts, no unbound variables are allowed, so '_' is interpreted as itself (an atom). Variables can only be bound in the MatchHead part.
- In the MatchBody and MatchCondition parts, only variables bound previously can be used.
- As a special case, the following apply in the MatchCondition/MatchBody parts:
  - The variable '$_' expands to the whole match target term.
  - The variable '$$' expands to a list of the values of all bound variables in order (that is, ['$1','$2', ...]).
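A small sketch of the two special variables, using ets:test_ms/2 on a sample tuple. The module name ms_vars and the sample data are hypothetical.

```erlang
-module(ms_vars).
-export([special_vars/0]).

%% Binds '$1' to the first and '$2' to the third element, ignores the
%% second, and returns a pair of the whole object ('$_') and the list
%% of bound variables in order ('$$').
special_vars() ->
    MatchSpec = [{{'$1', '_', '$2'}, [], [{{'$_', '$$'}}]}],
    ets:test_ms({a, b, c}, MatchSpec).
```

The element matched by '_' is not bound, so it is part of '$_' but not of '$$'.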
In the MatchHead part, all literals (except the variables above) are
interpreted "as is".
In the MatchCondition/MatchBody parts, the interpretation is in some ways
different. Literals in these parts can either be written "as is", which works
for all literals except tuples, or by using the special form {const, T}, where
T is any Erlang term.
For tuple literals in the match specification, double tuple parentheses can also
be used, that is, construct them as a tuple of arity one containing a single
tuple, which is the one to be constructed. The "double tuple parenthesis" syntax
is useful to construct tuples from already bound variables, like in
{{'$1', [a,b,'$2']}}. Examples:
| Expression | Variable Bindings | Result |
|---|---|---|
| {{'$1','$2'}} | '$1' = a, '$2' = b | {a,b} |
| {const, {'$1', '$2'}} | Irrelevant | {'$1', '$2'} |
| a | Irrelevant | a |
| '$1' | '$1' = [] | [] |
| [{{a}}] | Irrelevant | [{a}] |
| ['$1'] | '$1' = [] | [[]] |
| 42 | Irrelevant | 42 |
| "hello" | Irrelevant | "hello" |
| $1 | Irrelevant | 49 (the ASCII value for character '1') |
Table: Literals in MatchCondition/MatchBody Parts of a Match Specification
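Some rows of the table can be reproduced with a small helper around ets:test_ms/2. The module name ms_literals and the function eval/2 are hypothetical names introduced for this sketch.

```erlang
-module(ms_literals).
-export([eval/2]).

%% Evaluates BodyExpr as the only MatchBody expression, with '$1' and
%% '$2' bound to the two elements of the tuple Obj.
eval(Obj, BodyExpr) ->
    {ok, Result} = ets:test_ms(Obj, [{{'$1', '$2'}, [], [BodyExpr]}]),
    Result.
```

For example, ms_literals:eval({a, b}, {{'$1', '$2'}}) builds the tuple {a,b} from the bound variables, while wrapping the same tuple in {const, ...} returns it untouched.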
Execution of the Match
The execution of the match expression, when the runtime system decides whether a trace message is to be sent, is as follows:
For each tuple in the MatchExpression list and while no match has succeeded:
- Match the MatchHead part against the match target term, binding the '$<number>' variables (much like in ets:match/2). If the MatchHead part cannot match the arguments, the match fails.
- Evaluate each MatchCondition (where only '$<number>' variables previously bound in the MatchHead part can occur) and expect it to return the atom true. When a condition does not evaluate to true, the match fails. If any BIF call generates an exception, the match also fails.
- Two cases can occur:
  - If the match specification is executing when tracing: Evaluate each ActionTerm in the same way as the MatchConditions, but ignore the return values. Regardless of what happens in this part, the match has succeeded.
  - If the match specification is executed when selecting objects from an ETS table: Evaluate the expressions in order and return the value of the last expression (typically there is only one expression in this context).
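The clause-by-clause evaluation described above can be sketched for the ETS case with a three-clause match specification. The module name ms_order and the result atoms are hypothetical.

```erlang
-module(ms_order).
-export([classify/1]).

%% Clauses are tried in order and the first one whose head matches and
%% whose conditions all evaluate to true produces the result.
classify(Obj) ->
    MatchSpec = [{{'$1', '$2'}, [{'<', '$2', 0}], [{{'$1', negative}}]},
                 {{'$1', '$2'}, [{is_integer, '$2'}], [{{'$1', non_negative}}]},
                 {'_', [], [other]}],
    ets:test_ms(Obj, MatchSpec).
```

A negative integer satisfies the conditions of both of the first two clauses, but only the first clause is used because evaluation stops at the first successful match.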
Differences between Match Specifications in ETS and Tracing
ETS match specifications produce a return value. Usually the MatchBody
contains one single ConditionExpression that defines the return value without
any side effects. Calls with side effects are not allowed in the ETS context.
When tracing, there is no return value to produce; the match specification either
matches or does not. The effect when the expression matches is a trace message
rather than a returned term. The ActionTerms are executed as in an imperative
language, that is, for their side effects. Functions with side effects are also
allowed when tracing.
Tracing Examples
Match an argument list of three, where the first and third arguments are equal:

    [{['$1', '_', '$1'],
      [],
      []}]

Match an argument list of three, where the second argument is a number > 3:

    [{['_', '$1', '_'],
      [{ '>', '$1', 3}],
      []}]

Match an argument list of three, where the third argument is either a tuple
containing argument one and two, or a list beginning with argument one and two
(that is, [a,b,[a,b,c]] or [a,b,{a,b}]):

    [{['$1', '$2', '$3'],
      [{'orelse',
          {'=:=', '$3', {{'$1','$2'}}},
          {'and',
            {'=:=', '$1', {hd, '$3'}},
            {'=:=', '$2', {hd, {tl, '$3'}}}}}],
      []}]

The above problem can also be solved as follows:

    [{['$1', '$2', {'$1', '$2'}], [], []},
     {['$1', '$2', ['$1', '$2' | '_']], [], []}]

Match two arguments, where the first is a tuple beginning with a list that in
turn begins with the second argument times two (that is, [{[4,x],y},2] or
[{[8], y, z},4]):

    [{['$1', '$2'],[{'=:=', {'*', 2, '$2'}, {hd, {element, 1, '$1'}}}],
      []}]

Match three arguments. When all three are equal and are numbers, append the process dump to the trace message, otherwise let the trace message be "as is", but set the sequential trace token label to 4711:

    [{['$1', '$1', '$1'],
      [{is_number, '$1'}],
      [{message, {process_dump}}]},
     {'_', [], [{set_seq_token, label, 4711}]}]

As can be noted above, the parameter list can be matched against a single
MatchVariable or an '_'. To replace the whole parameter list with a single
variable is a special case. In all other cases the MatchHead must be a
proper list.
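Tracing match specifications such as the ones above can be tried without enabling any tracing by using erlang:match_spec_test/3 with the type trace. The following is a minimal sketch; the module name ms_trace_test and the function test/2 are hypothetical.

```erlang
-module(ms_trace_test).
-export([test/2]).

%% Runs a tracing match specification against an argument list.
%% The returned value is the Result element documented for
%% erlang:match_spec_test/3: true if a trace message would be sent,
%% false if not, or the term set with {message, Term}.
test(Args, MatchSpec) ->
    {ok, Result, _Flags, _Warnings} =
        erlang:match_spec_test(Args, MatchSpec, trace),
    Result.
```

For instance, ms_trace_test:test([a, x, a], [{['$1', '_', '$1'], [], []}]) checks the first example of this section against a three-element argument list whose first and third arguments are equal.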
Generate a trace message only if the trace control word is set to 1:

    [{'_',
      [{'==',{get_tcw},{const, 1}}],
      []}]

Generate a trace message only if there is a seq_trace token:

    [{'_',
      [{is_seq_trace}],
      []}]

Remove the 'silent' trace flag when the first argument is 'verbose', and add
it when it is 'silent':

    [{'$1',
      [{'==',{hd, '$1'},verbose}],
      [{trace, [silent],[]}]},
     {'$1',
      [{'==',{hd, '$1'},silent}],
      [{trace, [],[silent]}]}]

Add a return_trace message if the function is of arity 3:

    [{'$1',
      [{'==',{length, '$1'},3}],
      [{return_trace}]},
     {'_',[],[]}]

Generate a trace message only if the function is of arity 3 and the first
argument is 'trace':

    [{['trace','$2','$3'],
      [],
      []}]

ETS Examples
Match all objects in an ETS table, where the first element is the atom
'strider' and the tuple arity is 3, and return the whole object:

    [{{strider,'_','_'},
      [],
      ['$_']}]

Match all objects in an ETS table with arity > 1 whose first element is 'gandalf', and return element 2:

    [{'$1',
      [{'==', gandalf, {element, 1, '$1'}},{'>=',{size, '$1'},2}],
      [{element,2,'$1'}]}]

In this example, if the first element had been the key, it is much more
efficient to match that key in the MatchHead part than in the
MatchConditions part. The search space of the tables is restricted with
regards to the MatchHead so that only objects with the matching key are
searched.
Match tuples of three elements, where the second element is either 'merry' or
'pippin', and return the whole objects:

    [{{'_',merry,'_'},
      [],
      ['$_']},
     {{'_',pippin,'_'},
      [],
      ['$_']}]

Function ets:test_ms/2 can be useful for testing complicated ETS matches.
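As a sketch of such testing, the module below runs two of the ETS examples from this section, one with ets:test_ms/2 on a single tuple and one with ets:select/2 on a real table. The module name ms_ets_examples, the table name fellowship, and the sample objects are hypothetical.

```erlang
-module(ms_ets_examples).
-export([gandalf/1, hobbits/0]).

%% Tests the 'gandalf' match specification against a single tuple.
gandalf(Obj) ->
    MatchSpec = [{'$1',
                  [{'==', gandalf, {element, 1, '$1'}},
                   {'>=', {size, '$1'}, 2}],
                  [{element, 2, '$1'}]}],
    ets:test_ms(Obj, MatchSpec).

%% Runs the 'merry'/'pippin' match specification on a small table.
%% Only tuples of exactly three elements can match the heads.
hobbits() ->
    Tab = ets:new(fellowship, [set]),
    true = ets:insert(Tab, [{1, merry, shire}, {2, pippin, shire},
                            {3, gimli, erebor}, {4, merry}]),
    MatchSpec = [{{'_', merry, '_'}, [], ['$_']},
                 {{'_', pippin, '_'}, [], ['$_']}],
    Result = lists:sort(ets:select(Tab, MatchSpec)),
    true = ets:delete(Tab),
    Result.
```

Testing against single tuples first keeps mistakes in the match specification separate from questions about the table contents.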