Sunday, May 20, 2012

Graphcuts for Python: pygco (slight update)

I have been using the excellent gco library for energy minimization with graph cuts for quite some time. Finally I got around to clean up / rewrite some of my Python wrappers so that maybe someone else can use them, too.


So what does this library do?
It does discrete energy minimization on loopy graphs. This is an important topic in computer vision, since it can be used for segmentation, stereo vision and optical flows. When you read "CRF" in a computer vision paper and there is no learning involved, it just means they minimized an energy function (probably using gco). Often this is just some form of "smoothing" but a lot more can be done with it.

Let's talk about pairwise energy functions. These functions take the form

E = \sum v_i(y_i) + \sum w_ij(y_i, w_j) 


with v_i and w_ij are some fixed parameters and y_i are the multinomial variables over which we want to minimize this energy. Typically the i run over all pixels in an image, and w_ij is only nonzero for adjacent pixels.
In general, think of v and w as tables (of size n_labels or n_labels x n_labels) for each node resp. edge in the graph. In simple models, these are the same tables for all nodes / edges, in more complicated ones they vary.

In general this is NP hard, but there are many interesting cases, in which this problem can (somewhat surprisingly) be solved efficiently and exactly.
If y_i are binary and the energy is submodular, the exact solution can be found efficiently.

If the y_i are not binary, but the energy fulfills a related property, it is possible to efficiently find good, if not optimal, solutions. See the seminal paper of Yuri Boykov, Olga Veksler and Ramin Zabih Fast approximate energy minimisation via graph cuts.

This is where gco comes in: it implements two move-making algorithms, alpha-expansion and alpha-beta-swaps, that solve a series of binary problems to obtain a solution to a multi-label problem.

Going back to the applications I mentioned above, these labels could be different depths in stereo vision, different directions of movements in optical flow, or different object classes in semantic segmentation.

There are several ways to specify the energy for gco and I don't want to go into to much detail here. Have a look at the README if you are interested.

For the moment, I have ported two ways to Python (but the rest should come soon). One is for general graphs, specified by listing all edges, the other one is for 2D grid graphs (images).
In all cases, the unary potentials (the v_i above) are just given as one number for each vertex (pixel in the grid graph case) and state.
In the simple case that I wrapped for Python, the pairwise potentials (the w_ij)
are fixed over the whole graph and only depend on the combination of labels y_i and y_j.
This contains the case of the simple Pott's potential, where w_ij is one if y_i = y_j and zero otherwise.

If you want to use the code, you need to download the original gco library and my wrappers. There is one thing you have to keep in mind when working with gco, though: all potentials are expected to be integers, so you need to round them!

Here are some examples (available at my git repo):
This is a trivial example of smoothing. We have a binary image, add some noise and want to recover the original image. The example contains a "grid graph" version and a version where I explicitly construct the edges.


The same works with multinomial input. Here I used two different kind of pairwise potentials, the second from the right used Pott's potentials, while the one on the very right knows that blue-green and green-red edges are more likely than blue-red.

And finally, a not-as-trivial-looking example:


The first two images on the top are consecutive frames from a video sequence from the middlebury benchmark. This example tries to find depth layers in the image by using paralax.
The second image is translated in x direction and the best fit for each pixel is found. The image on the top right gives the best translation for each individual pixel. The bottom images show the results obtained with alpha-expansion and Pott's potentials (right) and a 1d topology (left).

This is not really how you would do stereo vision (the potentials I used are very naive) but I feel it is a nice example (about 10 lines in Python).

I hope some people find this helpful an there will be more computer vision code in Python soon :)

33 comments:

  1. Hi,
    I failed to install pycgo on my machine.
    It fails on python setup.py build step, it can't compile gco library properly.
    any hints ?

    ReplyDelete
  2. Hi Kostia.
    In writing the wrapper I was a bit lazy. You have to compile gco yourself. Unfortunately it doesn't come with a makefile iirc.
    I'm not on my own box at the moment so I can't look up what I used but I think it should be a single gcc line (something along the lines of gcc --shared GCOptimization.cpp -O2 -o gco.so).

    ReplyDelete
    Replies
    1. Hi Andreas,
      I tried this, also I tried smth like this

      gcc -fPIC -c GCoptimization.cpp \
      LinkedBlockList.cpp \
      graph.cpp \
      maxflow.cpp

      g++ -shared GCoptimization.o maxflow.o .o LinkedBlockList.o graph.o -o gco.so

      but still it didn't compile for me...

      Delete
    2. What did not compile? gco.so or the Cython part?
      And what error message did you get?

      Delete
    3. gco.so didn't compile...
      I got a lot of them...
      for instance,
      the beginning of it is

      GCoptimization.h:389:16: error: ‘ptrdiff_t’ does not name a type
      GCoptimization.h:414:16: error: ‘ptrdiff_t’ does not name a type
      GCoptimization.cpp:31:23: warning: ‘GCO_CLOCKS_PER_SEC’ initialized and declared ‘extern’ [enabled by default]
      GCoptimization.cpp: In member function ‘GCoptimization::SiteID GCoptimization::GreedyIter::feasibleSites() const’:
      GCoptimization.cpp:555:73: error: no match for ‘operator-’ in ‘((const GCoptimization::GreedyIter*)this)->GCoptimization::GreedyIter::m_siteend - ((const GCoptimization::GreedyIter*)this)->GCoptimization::GreedyIter::m_site’
      GCoptimization.cpp: In member function ‘GCoptimization::EnergyTermType GCoptimization::DataCostFnSparse::search(GCoptimization::DataCostFnSparse::DataCostBucket&, GCoptimization::SiteID)’:
      GCoptimization.cpp:1823:19: error: ‘cLinearSearchSize’ was not declared in this scope
      In file included from GCoptimization.h:108:0,
      from GCoptimization.cpp:4:
      energy.h: In instantiation of ‘int Energy::get_var(Energy::Var) [with captype = int; tcaptype = int; flowtype = long long int; Energy::Var = int]’:
      GCoptimization.cpp:197:20: required from here
      energy.h:328:91: error: ‘what_segment’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
      energy.h:328:91: note: declarations in dependent base ‘Graph’ are not found by unqualified lookup
      energy.h:328:91: note: use ‘this->what_segment’ instead
      energy.h: In instantiation of ‘void Energy::add_term1(Energy::Var, Energy::Value, Energy::Value) [with captype = int; tcaptype = int; flowtype = long long int; Energy::Var = int; Energy::Value = int]’:
      GCoptimization.cpp:280:22: required from here
      energy.h:207:2: error: ‘add_tweights’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
      energy.h:207:2: note: declarations in dependent base ‘Graph’ are not found by unqualified lookup
      energy.h:207:2: note: use ‘this->add_tweights’ instead
      energy.h: In instantiation of ‘void Energy::add_term2(Energy::Var, Energy::Var, Energy::Value, Energy::Value, Energy::Value, Energy::Value) [with captype = int; tcaptype = int; flowtype = long long int; Energy::Var = int; Energy::Value = int]’:
      GCoptimization.cpp:304:42: required from here
      energy.h:220:2: error: ‘add_tweights’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
      energy.h:220:2: note: declarations in dependent base ‘Graph’ are not found by unqualified lookup
      energy.h:220:2: note: use ‘this->add_tweights’ instead
      energy.h:235:3: error: ‘add_tweights’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
      energy.h:235:3: note: declarations in dependent base ‘Graph’ are not found by unqualified lookup
      energy.h:235:3: note: use ‘this->add_tweights’ instead
      energy.h:236:3: error: ‘add_tweights’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
      energy.h:236:3: note: declarations in dependent base ‘Graph’ are not found by unqualified lookup

      Delete
    4. Which of the two commands did that? The second should just link, right? And did you try the first with g++ ?

      Delete
    5. it did the first command... Yes, I tried...
      I also googled for it, but without success.
      I didn't use GCO before. May be It couldn't be compiled with gcc/g++ ?
      Did you compile it under linux ?

      Delete
    6. This comment has been removed by the author.

      Delete
    7. I found that in order to compile gco with a recent version of gcc, you need to include stddef.h in GCoptimization.h, then it's just a matter of compiling and linking into a shared object.

      Delete
  3. This comment has been removed by the author.

    ReplyDelete
  4. The problem was because of my gcc 4.7. I managed to compile it, but didn't tested a lot, but at first glance it works.

    ReplyDelete
    Replies
    1. I ran into the same problem with gcc 4.7. I was able to compile by editing energy.h and adding "this->" in front of the following functions: what_segment(), add_edge(), add_tweights()

      Delete
    2. Kostia Could you please let me know how you solved the compilation problem. I am using 4.4.6. Thanks. I am new to the Linux.

      Anonymous, I tried putting this in front of the functions. Still I am facing compilation problem. Could you please paste your modified code

      Delete
    3. Andreas,

      Could you please let me know how GCO will help in segmenting the image and assigning labels. When I am looking at the code no where Image is being taken as input. If so how it will segment the image. Please let me know. Thanks.

      Delete
  5. This comment has been removed by the author.

    ReplyDelete
  6. A cross-platform solution for compiling gco, just use the following CMakeLists.txt

    cmake_minimum_required (VERSION 2.6)

    PROJECT(gco)

    SET( gcoSources
    GCoptimization.h GCoptimization.cpp
    graph.h graph.cpp
    LinkedBlockList.h LinkedBlockList.cpp
    energy.h
    block.h
    maxflow.cpp )

    SET( LIBRARY_OUTPUT_PATH ${CMAKE_CURRENT_BINARY_DIR}/lib CACHE PATH
    "Output for libraries" )

    ADD_LIBRARY(gco SHARED ${gcoSources})

    I also had to follow the instructions at

    http://web.archive.org/web/20101221084639/http://nukeit.org/compile-python-2-7-packages-with-visual-studio-2010-express/

    to get distutils to work with windows otherwise very easy. Thanks for the blog!

    ReplyDelete
  7. Thanks :) I was a bit to lazy to do this. If you want, a pull request for github with some instructions would be very welcome.

    ReplyDelete
  8. Hey Andreas,

    Thanks for the wrapper. I added edge weights as well which was easy, so though of sharing it with you. It uses the setSmoothCostVH function.

    Thanks again, Amir

    #############
    def cut_simple_vh(np.ndarray[np.int32_t, ndim=3, mode='c'] unary_cost,
    np.ndarray[np.int32_t, ndim=2, mode='c'] pairwise_cost,
    np.ndarray[np.int32_t, ndim=2, mode='c'] costV,
    np.ndarray[np.int32_t, ndim=2, mode='c'] costH,
    n_iter=5,
    algorithm='expansion'):
    """
    Apply multi-label graphcuts to grid graph.

    Parameters
    ----------
    unary_cost: ndarray, int32, shape=(width, height, n_labels)
    Unary potentials
    pairwise_cost: ndarray, int32, shape=(n_labels, n_labels)
    Pairwise potentials for label compatibility
    costV: ndarray, int32, shape=(width, height)
    Vertical edge weights
    costH: ndarray, int32, shape=(width, height)
    Horizontal edge weights
    n_iter: int, (default=5)
    Number of iterations
    algorithm: string, `expansion` or `swap`, default=expansion
    Whether to perform alpha-expansion or alpha-beta-swaps.
    """

    if unary_cost.shape[2] != pairwise_cost.shape[0]:
    raise ValueError("unary_cost and pairwise_cost have incompatible shapes.\n"
    "unary_cost must be height x width x n_labels, pairwise_cost must be n_labels x n_labels.\n"
    "Got: unary_cost: (%d, %d, %d), pairwise_cost: (%d, %d)"
    %(unary_cost.shape[0], unary_cost.shape[1], unary_cost.shape[2],
    pairwise_cost.shape[0], pairwise_cost.shape[1]))
    if pairwise_cost.shape[1] != pairwise_cost.shape[0]:
    raise ValueError("pairwise_cost must be a square matrix.")
    cdef int h = unary_cost.shape[1]
    cdef int w = unary_cost.shape[0]
    cdef int n_labels = pairwise_cost.shape[0]
    if (pairwise_cost != pairwise_cost.T).any():
    raise ValueError("pairwise_cost must be symmetric.")
    if costV.shape[0] != w or costH.shape[0] != w or costV.shape[1] != h or costH.shape[1] != h:
    raise ValueError("incorrect costV or costH dimensions.")

    cdef GCoptimizationGridGraph* gc = new GCoptimizationGridGraph(h, w, n_labels)
    gc.setDataCost(unary_cost.data)
    gc.setSmoothCostVH(pairwise_cost.data, costV.data, costH.data)
    if algorithm == 'swap':
    gc.swap(n_iter)
    elif algorithm == 'expansion':
    gc.expansion(n_iter)
    else:
    raise ValueError("algorithm should be either `swap` or `expansion`. Got: %s" % algorithm)

    cdef np.npy_intp result_shape[2]
    result_shape[0] = w
    result_shape[1] = h
    cdef np.ndarray[np.int32_t, ndim=2] result = np.PyArray_SimpleNew(2, result_shape, np.NPY_INT32)
    cdef int * result_ptr = result.data
    for i in xrange(w * h):
    result_ptr[i] = gc.whatLabel(i)
    return result

    ReplyDelete
  9. ha, the comment section seems to screw all the indentations :(

    ReplyDelete
  10. Thanks for sharing. If you want, you can add a pull-requestion on github. Otherwise it might take a while until I have time to include it.

    ReplyDelete
  11. Dear Andreas,

    this blog is very cool. I'm interested in this gco program, but I didn't find a good example in c++, like middlebury. So ff I have a grayscale image and lets say 4 labels, then how can I run this program, to segment the image into 4 region. I know, that you interested in python, but do you know some examples in the net?
    Thanks.

    ReplyDelete
    Replies
    1. Thanks :)
      You can find code for middlebury that compares several approaches here: http://vision.middlebury.edu/MRF/
      I haven't looked at it in detail but it should be possible to reproduce the middlebury results with it.

      Delete
  12. Hi there. I'm trying to use your wrappers but after compiling and building them i'm having always the same error : "DLL load failed: The specified module could not be found". I used the depency walker tools on the pygco.pyd and says that At least one required implicit or forwarded dependency was not found: LIBSTDC++-6.DLL, LIBGCC_S_SJLJ-1.DLL. When i look it up on google most says that these are problems of compilers versions but i dont know how to resolve them. Can you help me?
    Thanks

    ReplyDelete
    Replies
    1. Sorry, I have no idea. You are building using GCC under Windows? Cygwin? Btw, maybe you should give http://hci.iwr.uni-heidelberg.de/opengm2/ a shot. They have a branch containing python wrappers: https://github.com/DerThorsten/opengm/tree/RC2.1

      Delete
  13. With windows. I'm no exepert with C++ and compilers but it worked for me one time, now i don't know whats rong. I cmpiled the files using Windows Visual studio 2012, i Build the wraper and copied the pygco.pyd file to my python interpreter directory. Isn't that what you supose to do?

    ReplyDelete
    Replies
    1. Thanks for your comment. The problem is that I can not maintain the source code of the actual implementation for license reasons. You can try to contact the people at western university that actually maintain the software to fix this. Fetching the source is complicated enough, I don't want to maintain a set of patches on their version.

      Delete
  14. I had a compilation error on Mac OS.

    gco_src/energy.h:245:3: error: use of undeclared identifier 'add_tweights'

    Here are some notes how to install gco_python on mac and resolve this problem:

    git clone https://github.com/amueller/gco_python.git
    cd gco_python
    mkdir gco_src
    cd gco_src
    curl -O http://vision.csd.uwo.ca/code/gco-v3.0.zip
    unzip gco-v3.0.zip
    curl -O https://raw.github.com/mjirik/pyseg_base/master/distr/energy.patch
    patch energy.h < energy.patch
    cd ..
    sudo python setup.py install

    https://github.com/mjirik/pyseg_base/blob/master/INSTALL

    ReplyDelete
  15. Hi Andy,

    The wrapper is awesome. Thank you for sharing.
    However, when I feed certain matrices to the cut_simple function, the programme always throw this error message.

    terminate called after throwing an instance of 'GCException'

    Do you have any idea what happened?

    Cheers.

    ReplyDelete
    Replies
    1. Hi Thanapong.
      I would assume that is because the potentials encoded by your matrix are not submodular. Have you checked that? In that case I would recommend you try pyqpbo. If you are lucky you can just "pip install pyqpbo".
      Cheers,
      Andy

      Delete
    2. Hi,

      Thank you very much. I will try the pyqpbo.

      Delete
  16. This comment has been removed by the author.

    ReplyDelete
    Replies
    1. This comment has been removed by the author.

      Delete
  17. Hello, may I have question. Some how I tried the function cut_from_graph(*args, **kwargs) but in the end it returns only 2-class instead e.g 3 or more I asked... Any advice? Thanks
    Btw, I can provide the sample code to have look, but there is no room here to upload it or put images (illustration).

    ReplyDelete