Expression

Metasm uses this class to represent arbitrary symbolic arithmetic expressions, e.g.

These expressions can include Integers, Symbols, and Strings.

The symbols and strings represent arbitrary variables, with the convention that strings represent fixed quantities (eg addresses, labels), whereas symbols represent more variable stuff (eg register values).

There is also a special symbol that may be used, :unknown, to represent a value that is known to be unknown. See the reduce section.

See also core/Indirection.txt.

The Expression class holds all methods relative to Integer binary manipulation, that is encoding and decoding from/to a binary blob (see also core/EncodedData.txt)

Members

Expressions hold exactly 3 members:

lexpr and rexpr can be any value, most often String, Symbol, Integer or Expression. For unary operators, lexpr is nil.

op is a Symbol representing the operation. It should be from the list:

Instantiation

In ruby code, use the class method []. It takes 1 to 3 arguments, lexpr, op, and rexpr. lexpr defaults to nil, and op defaults to :+ (except for negative numeric values, which is stored with op == :- and rexpr == abs).

If lexpr or rexpr are an Array, the [] constructor is called recursively, to ease the definition of nested Expressions.

Exemples:

  Expression[42]
  Expression[:eax, :+, 12]
  Expression[:-, 'my_var']
  Expression[[:eax, :-, 4], :*, [:ebx, :+, 0x12]]

The Expression class also includes a parser, to allow creating an expression from a string. parse_string! will create an Expression and update its argument to point after the last part read successfully into the expr. The parser handles standard C operator precedence.

  str = "1 + var"
  Expression.parse_string!(str)    # => Expression[1, :+, "var"]
  str = "42 bla"
  Expression.parse_string!(str)    # => Expression[42]
  str                             # => "bla"

Use parse_string without the ! to parse the string without updating it.

External variables

The externals method will return all non-integer members of the Expression.

  Expression[[:eax, :+, 42], :-, "bla"].externals  # => [:eax, "bla"]

Pattern matching

The match method allows to check an Expression against a pattern without having to check individual members. The pattern should be an Expression, whose variable members should be Strings or Symbols, which are also passed as arguments to the match function. On successful match, the correspondance between variable patterns and their actual value matched is returned as a Hash.

  Expression[1, :+, 2].match(Expression['var', :+, 2], 'var')
  # => { 'var' => 1 }
  Expression[1, :+, 2].match(Expression['var', :+, 'var'], 'var')
  # => nil
  Expression[1, :+, 1].match(Expression['var', :op, 'var'], 'var', :op)
  # => { 'var' => 1, :op => :+ }

Reduction

Metasm Expressions include a basic symbolic computation engine, that allows some simple transformations of the Expression. The reduction will also compute numerical values whenever possible. If the final result is fully numeric, an Integer is returned, otherwise a new Expression is returned.

In this context, the special value :unknown has a particular meaning.

  Expression[1, :+, 2].reduce
  # => 3
  Expression[:eax, :+, [:ebx, :-, :eax]].reduce
  # => Expression[:ebx]
  Expression[1, :+, [:eax, :+, 2]].reduce
  # => Expression[:eax, :+, 3]
  Expression[:unknown, :+, :eax].reduce
  # => Expression[:unknown]

The symbolic engine operates mostly on addition/substractions, and no-operations (eg shift by 0). It also handles some boolean composition.

The detail can be found in the #replace_rec method body, in metasm/main.rb.

The reduce method can also take a block argument, which will be called at every step in the recursive reduction, for custom operations. If the block returns nil, the result is unchanged, otherwise the new value is used as replacement. For exemple, if you operate on 32-bit values and want to get rid of bla & 0xffffffff, use

  some_expr.reduce { |e|
    if e.kind_of?(Expression) and e.op == :& and e.rexpr == 0xffff_ffff
      e.lexpr
    end
  }

Binding

An expression involving variable externals can be bound using a Hash. This will replace any occurence of a key of the Hash by its value in the expression members. The bind method will return a new Expression with the substitutions, and the bind! method will update the Expression in-place.

  Expression['val', :+, 'stuff'].bind('val' => 4, 'stuff' => 8).reduce
  # => 12
  Expression[:eax, :+, :ebx].bind(:ebx => 42)
  # Expression[:eax, :+, 42]
  Expression[:eax, :+, :ebx].bind(:ebx => :ecx)
  # Expression[:eax, :+, :ecx]

You can use Expressions as keys, but they will only be used on perfect matches.

Binary packing

Encoding

The encode method will generate an EncodedData holding the expression, either as binary if it can reduce to an integral value, or as a relocation. The arguments are the relocation type and the endianness, plus an optional backtrace (to notify the user where an overflowing relocation comes from).

The encode_imm class method will generate a raw String for a given integral value, a type and an endianness. The type can be given as a byte size.

  Expression.encode_imm(42, :u8, :little)  # => "*"
  Expression.encode_imm(42, 1, :big)       # => "*"
  Expression.encode_imm(256, :u8, :little) # raise EncodeError

On overflows (value cannot be encoded in the bit field) an EncodeError exception is raised.

Decoding

The decode_imm class method can be used to read a binary value into an Integer, with an optional offset into the binary string.

  Expression.decode_imm("*", :u8, :little)               # => 42
  Expression.decode_imm("bla\xfe\xff", :i16, :little, 3) # => -2

Arithmetic coercion

Expression implement the :+ and :- ruby methods, so that expr + 4 works as expected. The result is reduced.

Integer methods

The Expression class offers a few methods to work with integers.

make_signed

make_signed will convert a raw unsigned to its equivalent signed value, given a bit size.

  Expression.make_signed(1, 16)      # => 1
  Expression.make_signed(0xffff, 16) # => -1

in_range?

in_range? can check if a given numeric value would fit in a particular core/Relocation.txt field. The method can return true or false if it fits or not, or nil if the result is unknown (eg the expr has no numeric value).

  Expression.in_range?(42, :i8)                 # => true
  Expression.in_range?(128, :i8)                # => false
  Expression.in_range?(-128, :i8)               # => true
  Expression.in_range?(Expression['bla'], :u32) # => nil