Other articles

  1. OCaml LLVM bindings tutorial, part 3

    The previous articles explain how to build applications using the OCaml-LLVM bindings, and how to use the API to manipulate the LLVM objects. This was the “read-only” part of the tutorial, which can be used to analyze LLVM IR.

    This part explains how to create LLVM IR: we write a simple application from scratch, then see how to build and run it.


    As in the previous tutorial, we need to create a context and a module:

    let llctx = global_context () in
    let llm = create_module llctx "mymodule" in


    There are two actions that can be done on functions:

    • declare_function to give only a declaration of the prototype,
    • define_function to give both the declaration and the implementation.

    In both cases, we need to give the signature (return type, number and type of arguments) of the function.

    This is pretty similar to C. We’ll use this to declare the function int main(void).

    The int type is a bit problematic in LLVM (and in C, but for other reasons): integer types must have a known size in LLVM. While this does not change the architecture-independent property ...

  2. OCaml LLVM bindings tutorial, part 2

    In the previous tutorial, we’ve seen how to use ocamlbuild and make to build a simple application. In this part, we’ll start exploring the API, and see how to access values and attributes of LLVM objects.

    The base of the code is the same as in part 1: it reads an existing LLVM bitcode file, for example one generated by clang.

    As in the previous part of the tutorial, knowing the LLVM C++ API is not required (but can help).

    LLVM objects

    The top-level container is a module (llmodule). The module contains global variables, types and functions. Functions in turn contain basic blocks, and basic blocks contain instructions.


    In the OCaml bindings, all objects (variables, functions, instructions) are instances of the opaque type llvalue.

    A value has a type, a name, a definition, a list of users, and other properties such as attributes (for example visibility or linkage options) or aliases.

    Each value has a type (lltype), a composite object that describes the type of the value and of its arguments. To pattern-match on the actual type, it first needs to be converted to a TypeKind.t:

    let rec print_type llty =
      let ty = Llvm ...
  3. OCaml LLVM bindings tutorial, part 1



    This is the first part of a tutorial series on how to use the OCaml bindings for LLVM. Why use the OCaml bindings? Because you can avoid the C++ API, and with it the huge amounts of time spent compiling the Clang sources, then your plugin, then debugging segfaults again and again. The bindings are stable, cover most of the API, and are quite simple to use, thanks to the Debian packages.

    This tutorial was written on Debian Sid; details may differ on other distributions, but should stay similar.

    The objectives of this first part are:

    • install the required packages
    • set up a build environment for ocamlbuild
    • build a simple application that reads an LLVM bitcode file and prints it


    The required packages are:

    • llvm-3.5-dev
    • libllvm-3.5-ocaml-dev
    • the LLVM and OCaml compilers (llvm-3.5, ocaml)
    • optionally, clang

    The current LLVM version is 3.6, however the OCaml bindings are currently disabled (See Debian bug #783919), because of changes in the required dependencies.

    Project Layout

    The sources are organized as follows:

    ├── build
    ├── Makefile
    └── src
        └── tutorial01.ml

    First application

    First, create the file src/tutorial01.ml:

    let _ =
      let llctx = Llvm.global_context () in
      let llmem = Llvm.MemoryBuffer.of_file Sys.argv.(1) in
      let ...
  4. Materials for my talk at SSTIC 2015 - PICON : Control Flow Integrity on LLVM IR

    Here are the materials for the talk PICON : Control Flow Integrity on LLVM IR, given during SSTIC 2015. While SSTIC is a French-speaking conference, I publish here in English because my other posts are also in English.

    Here is the summary, from the website:

    Control flow integrity has been a well explored field of software security for more than a decade.

    However, most of the proposed approaches are stalled in a proof of concept state - when the implementation is publicly available - or have been designed with a minimal performance overhead as their primary objective, sacrificing security.

    Currently, none of the proposed approaches can be used to fully protect real-world programs compiled with most common compilers (e.g. GCC, Clang/LLVM).

    In this paper we describe a control flow integrity enforcement mechanism whose main objective is security. Our approach is based on compile-time code instrumentation, making the program communicate with its external execution monitor. The program is terminated by the monitor as soon as a control flow integrity violation is detected.

    Our approach is implemented as an LLVM plugin and is working on LLVM’s Intermediate Representation.

    Code is currently being published (with an opensource ...

  5. Python scripts in GDB

    Since version 7.0, gdb can execute Python scripts. This makes it easy to write gdb extensions and commands, or to manipulate data. It also makes it possible to manipulate graphic data (by spawning commands in threads), change the program, or even write a firewall (ahem ..). I’ll assume you’re familiar with both gdb commands and basic Python scripts.

    The first and very basic test is to check a simple command

    (gdb) python print("Hello, world !")
    Hello, world !

    So far so good. Yet, printing hello world won’t help us to debug our programs :)

    The gdb reference documentation covers the Python API, but it is of little help when it comes to actually manipulating data. I’ll try to give a few examples here.

    The Python script

    The first thing to do is to write a script (we’ll call it gdb-wzdftpd.py) containing the Python commands.

    We will define a command to print GLib’s GList type, with its nodes and their content (which is stored using a void*).

    To define a new command, we have to create a new class inheriting from gdb.Command. This class has two mandatory methods, __init__ and invoke.

    Gdb redirects stdout and stderr to ...

  6. animated charts in python and Qt

    I’m currently trying to generate interactive (and animated) charts in Python + Qt. The ideal library would be:

    • portable: this is one of the reasons for choosing PyQt
    • simple: same reason
    • interactive: I want to be able to select, for example, the slices of a pie chart. A signal and event mechanism like Qt’s would be perfect
    • animated: this is useless, but looking at things like AnyChart or FusionCharts, the result is really nice!
    • light on dependencies: relying on tons of libs makes the project hard to maintain and not portable, especially on Windows, where there is no packaging and dependency system.
    • free software

    A quick search gave me the following products:

    • matplotlib: mostly for scientific plots, but there is a nice number of options, a well-documented API.
    • pyQwt: Python bindings for Qwt. Again, it is geared more towards scientific plots than charts
    • cairoplot: the project looks dead (or is in the "yeah, the project’s not finished, but we’re recoding it in $LANG to be faster" syndrome, which is more or less the same). It generates images, though item maps can be extracted. As the name suggests, it uses Cairo.
    • pyCha: some nice charts, uses Cairo. Very simple API (not ...
  7. libnetfilter-{queue,log} bindings release

    I just released nfqueue-bindings 0.2 and nflog-bindings 0.1. Despite the different version numbers, the functions are almost the same :)

    Here is a short diff since the previous version:

    Add af_family argument to bind operations (allow IPv6 binds)
    Add notes on set_queue_maxlen requiring a kernel >= 2.6.20
    bugfix: use queue number when creating queue
    bugfix: really link Perl binding to Perl library 
    Fix cmake warning

    Get them on nfqueue-bindings and nflog-bindings.

  8. Git rocks

    No news here, this post is mostly a note for myself, to remember some commands for git:

    Creating a repository to be shared between several hosts (with an existing project)

    On the server:

    mkdir project.git
    cd project.git
    git --bare init

    On the remote host:

    cd project
    git init
    git remote add origin ssh://server/var/git/project
    git config branch.master.remote origin
    git config branch.master.merge refs/heads/master

    Now you can make the first commit:

    git add .
    git commit -m "First commit"
    git push

    Fix a mistake in a previous commit

    1. Save your work so far.
    2. Stash your changes away for now: git stash
    3. Now your working copy is clean at the state of your last commit.
    4. Use git rebase -i, and use the edit command on the commit you want to edit
    5. Make the fixes. (If you just want to change the log, skip this step.)
    6. Commit the changes in “amend” mode: git commit --all --amend
    7. Your editor will come up asking for a log message (by default, the old log message). Save and quit the editor when you’re happy with it.
    8. The new changes are added on to the old commit. See ...
  9. Sections and variables initialization

    Default init

    ANSI C requires all uninitialized static and global variables to be initialized with 0 (§6.7.8 of the C99 standard). This means you can rely on the following behavior:

    #include <stdio.h>

    int global;  /* no initializer: guaranteed to start at 0 */
    void function() {
        printf("%d\n", global);
    }

    This will print 0, and it is guaranteed by the standard.

    However, this is not handled by the compiler itself. All you will see in the symbol table is that the variable is placed in the .bss section:

    08049560 l     O .bss   00000004              static_var.1279
    08049564 g     O .bss   00000004              global_var
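
    As a minimal sketch, a program along the following lines would be expected to yield symbols like the two in the listing above. The names global_var and static_var are taken from the listing; the function read_static and everything else are assumptions made for illustration, not the original source.

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* File-scope variable without an initializer: it has static storage
       duration, so the standard guarantees that it starts at 0. */
    int global_var;

    /* Hypothetical helper: returns the value of a function-local static,
       which is also zero-initialized before main() runs. */
    int read_static(void)
    {
        static int static_var;
        return static_var;
    }

    int main(void)
    {
        /* Both checks rely only on the zero-initialization guarantee. */
        assert(global_var == 0);
        assert(read_static() == 0);
        printf("global_var = %d, static_var = %d\n", global_var, read_static());
        return 0;
    }
    ```

    Dumping the symbol table of the resulting binary (for instance with objdump -t) should show both variables in .bss. The addresses and the numeric suffix appended to static_var depend on the compiler and the platform.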

    The variables are actually zeroed at startup, by the loader or the startup code, not by the compiler.

    The C compiler usually puts variables that are supposed to be initialized with 0 in the .bss section instead of the .data section. Unlike the .data section, the .bss section does not contain actual data; it only specifies the size of the elements it contains. The C compiler just *assumes* that the linker, loader, or the startup code of the C library initializes this block of memory with 0. This is an optimization: .data elements occupy space in the image (or ROM or flash memory) and in RAM, whereas .bss elements need to occupy RAM space only if ...
