Interactive Segmentation Techniques

Jia He , Chang-Su Kim , C.-C. Jay Kuo


The research mainly focuses on two aspects:

More efficient mode of user interaction.
More efficient use of the interaction provided by users.

Motivation, Objective, and Related Works:

Objectives

Interactive segmentation involves user interaction to indicate the “objectness” and thus to guide an accurate segmentation.
With prior knowledge of objects (such as brightness, color, location, and size) and constraints indicated by user interaction, segmentation algorithms often generate satisfactory results.
Most interactive segmentation systems provide an iterative procedure that lets users add controls to intermediate results until a satisfactory segmentation is obtained. This requires the system to respond quickly and update the result immediately for further refinement, which in turn demands interactive segmentation algorithms of acceptable computational complexity.

Related Works

[Focus cut] 2 ways to improve Interactive Segmentation

More efficient mode of user interaction:

1. The bounding box-based:
  - "Deep grab cut for object selection", BMVC, 2017.
2. The polygon-based:
  - "Efficient interactive annotation of segmentation datasets with polygon-rnn++", CVPR, 2018.
  - "Annotating object instances with a polygon-rnn", CVPR, 2017.
  - "Fast interactive object annotation with curve-gcn", CVPR, 2019.
3. The clicks-based:
  - "Interactive full image segmentation by considering all regions jointly", CVPR, 2019.
  - "Deep interactive thin object selection", WACV, 2021.
  - "Deep extreme cut: From extreme points to object segmentation", CVPR, 2018.
4. The scribbles-based:
  - "Error-tolerant scribbles based interactive image segmentation", CVPR, 2014.
  - "Milcut: A sweeping line multiple instance learning paradigm for interactive image segmentation", CVPR, 2014.
5. Combinations:
  - "Two-in-one refinement for interactive segmentation", BMVC, 2020.
  - "Interactive object segmentation with inside-outside guidance", CVPR, 2020.

More efficient use of the interaction provided by users:

The interaction ambiguity:

  - "Conditional diffusion for interactive segmentation", ICCV, 2021.
  - "Interactive image segmentation with latent diversity", CVPR, 2018.
  - "Multiseg: Semantically meaningful, scale-diverse segmentations from minimal user input", ICCV, 2019.

The input information:

  - "Interactive image segmentation with first click attention", CVPR, 2020.
  - "Content-aware multilevel guidance for interactive instance segmentation", CVPR, 2019.

The back-propagating:

  - "Interactive image segmentation via back-propagating refinement scheme", CVPR, 2019.
  - "F-brs: Rethinking back-propagating refinement for interactive segmentation", CVPR, 2020.

Methodology

A classic image model is to treat an image as a graph. One can build a graph based on the relations between pixels, along with prior knowledge of objects.
- The most commonly used graph model in image segmentation is the Markov random field (MRF), where image segmentation is formulated as an optimization problem that optimizes random variables, which correspond to segmentation labels, indexed by nodes in an image graph.
- With prior knowledge of objects, the maximum a posteriori (MAP) estimation method offers an efficient solution. Given an input image, this is equivalent to minimizing an energy cost function defined by the segmentation posterior, which can be solved by:
  - Graph-cut [10, 11].
  - The shortest path [12, 13].
  - Random walks [14, 15].
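The energy that MAP estimation minimizes can be sketched as follows. This is a minimal illustration, assuming a binary labeling on a 4-connected pixel grid with a Potts smoothness term; the names `mrf_energy`, `unary`, and `lam` are hypothetical and do not come from the source. It only evaluates the objective; the solvers listed above are what search for its minimum.

```python
def mrf_energy(labels, unary, lam=1.0):
    """Energy of a labeling on a 4-connected grid.

    labels: 2-D list of ints (0 = background, 1 = foreground).
    unary:  unary[y][x][l] is the cost of giving pixel (x, y) label l.
    lam:    weight of the Potts smoothness term (assumed, illustrative).
    """
    h, w = len(labels), len(labels[0])
    energy = 0.0
    for y in range(h):
        for x in range(w):
            # Data term: how poorly the chosen label fits this pixel.
            energy += unary[y][x][labels[y][x]]
            # Smoothness term: penalize label changes to the right/down neighbor.
            if x + 1 < w and labels[y][x] != labels[y][x + 1]:
                energy += lam
            if y + 1 < h and labels[y][x] != labels[y + 1][x]:
                energy += lam
    return energy


# Toy 2x2 image: the left column prefers background, the right foreground.
unary = [[[0.0, 2.0], [2.0, 0.0]],
         [[0.0, 2.0], [2.0, 0.0]]]
good = [[0, 1], [0, 1]]
bad = [[1, 0], [1, 0]]
print(mrf_energy(good, unary), mrf_energy(bad, unary))
```

The labeling that agrees with the data term receives the lower energy, which is the behavior a minimizer would exploit.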
Another line of research targets region merging and splitting, with emphasis on the completion of object regions. This approach relies on the observation that each object is composed of homogeneous regions, while the background contains regions distinct from those of the objects.
- The merging and splitting of regions can be determined by statistical hypothesis testing techniques [16–18].
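To make the idea of a statistical merging decision concrete, here is a minimal sketch. It assumes each region is summarized by its pixel intensities and uses a Welch-style two-sample statistic with a fixed cutoff. This is a swapped-in textbook test chosen for illustration, not the specific criterion of [16–18]; `should_merge` and `threshold` are hypothetical names.

```python
from math import sqrt
from statistics import mean, variance


def should_merge(region_a, region_b, threshold=2.0):
    """Decide whether two regions look like samples of one distribution.

    region_a, region_b: lists of pixel intensities (at least two values each).
    threshold: cutoff on the test statistic (assumed, illustrative value).
    """
    var_a, var_b = variance(region_a), variance(region_b)
    # Standard error of the difference of the two means (Welch's form).
    se = sqrt(var_a / len(region_a) + var_b / len(region_b))
    diff = abs(mean(region_a) - mean(region_b))
    if se == 0.0:
        # Both regions are perfectly flat: merge only if the means agree.
        return diff == 0.0
    return diff / se < threshold


sky_1 = [200, 202, 198, 201, 199]
sky_2 = [201, 199, 200, 202, 198]
tree = [60, 65, 58, 62, 61]
print(should_merge(sky_1, sky_2), should_merge(sky_1, tree))
```

Two patches of similar statistics are merged, while patches with clearly different means are kept apart.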

The goal of interactive segmentation is to obtain accurate segmentation results based on user input and control while minimizing interaction effort and time as much as possible [19, 20].
  - 19. Malmberg F (2011) Graph-based methods for interactive image segmentation. Ph.D. thesis, University West
  - 20. Shi R, Liu Z, Xue Y, Zhang X (2011) Interactive object segmentation using iterative adjustable graph cut. In: Visual communications and image processing (VCIP), IEEE, 2011, pp 1–4
To meet this goal, researchers have proposed various solutions and their improvements [18, 21–24]. Their research has focused on algorithmic efficiency and satisfactory user interaction experience.
  - 18. Ning J, Zhang L, Zhang D, Wu C (2010) Interactive image segmentation by maximal similarity based region merging. Pattern Recogn 43(2):445–456
  - 21. Calderero F, Marques F (2010) Region merging techniques using information theory statistical measures. IEEE Trans Image Process 19(6):1567–1586
  - 22. Couprie C, Grady L, Najman L, Talbot H (2009) Power watersheds: a new image segmentation framework extending graph cuts, random walker and optimal spanning forest. In: 2009 IEEE 12th international conference on computer vision, pp 731–738. IEEE
  - 23. Falcão A, Udupa J, Miyazawa F (2000) An ultra-fast user-steered image segmentation paradigm: live wire on the fly. IEEE Trans Med Imag 19(1):55–62
  - 24. Noma A, Graciano A, Consularo L, Bloch I (2012) Interactive image segmentation by matching attributed relational graphs. Pattern Recogn 45(3):1159–1179

Practical segmentation tools:
- The Magnetic Lasso Tool, the Magic Wand Tool, and the Quick Select Tool in Adobe Photoshop [25].
  - 25. Collins LM (2006) Byu scientists create tool for “virtual surgery”. Deseret Morning News pp 07–31
- The Intelligent Scissors [26] and the Foreground Select Tool [27, 28] in GIMP [29].
  - 26. Mortensen EN, Barrett WA (1995) Intelligent scissors for image composition. In: Proceedings of the 22nd annual conference on Computer graphics and interactive techniques, SIGGRAPH ’95, pp 191–198. ACM, New York
  - 27. Friedland G, Jantz K, Rojas R (2005) Siox: simple interactive object extraction in still images. In: Seventh IEEE international symposium on multimedia, p 7. IEEE
  - 28. Friedland G, Lenz T, Jantz K, Rojas R (2006) Extending the siox algorithm: alternative clustering methods, sub-pixel accurate object extraction from still images, and generic video segmentation. Free University of Berlin, Department of Computer Science, Technical report B-06-06
  - 29. Gimp G (2008) Image manipulation program. User manual, Edge-detect filters, Sobel, The GIMP Documentation Team

Metrics: Evaluations have been conducted on interactive segmentation methods, including segmentation accuracy, running time, user interaction experience, and memory requirement [2, 9, 22, 24, 30, 31].
  - 2. Grady L, Sun Y, Williams J (2006) Three interactive graph-based segmentation methods applied to cardiovascular imaging. In: Paragios N, Chen Y, Faugeras O (eds) Handbook of Mathematical Models in Computer Vision. Springer, pp 453–469
  - 9. McGuinness K, O’Connor N (2010) A comparative evaluation of interactive segmentation algorithms. Pattern Recogn 43(2):434–444
  - 22. Couprie C, Grady L, Najman L, Talbot H (2009) Power watersheds: a new image segmentation framework extending graph cuts, random walker and optimal spanning forest. In: 2009 IEEE 12th international conference on computer vision, pp 731–738. IEEE
  - 24. Noma A, Graciano A, Consularo L, Bloch I (2012) Interactive image segmentation by matching attributed relational graphs. Pattern Recogn 45(3):1159–1179
  - 30. Lombaert H, Sun Y, Grady L, Xu C (2005) A multilevel banded graph cuts method for fast image segmentation. In: Tenth IEEE international conference on computer vision, 2005. ICCV, vol 1, pp 259–265. IEEE
  - 31. McGuinness K, O’Connor NE (2011) Toward automated evaluation of interactive segmentation. Comput Vis Image Underst 115(6):868–884
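Segmentation accuracy can be quantified in several ways. As one common choice, and an assumption here since the text above does not name a specific measure, the Jaccard index (intersection over union) between a predicted mask and a ground-truth mask can be sketched as follows; `jaccard_index` is an illustrative name.

```python
def jaccard_index(predicted, ground_truth):
    """Intersection over union of two binary masks given as 2-D lists of 0/1."""
    intersection = 0
    union = 0
    for row_p, row_g in zip(predicted, ground_truth):
        for p, g in zip(row_p, row_g):
            intersection += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # By convention, two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else intersection / union


predicted = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
ground_truth = [[0, 1, 1],
                [0, 1, 0],
                [0, 1, 0]]
print(jaccard_index(predicted, ground_truth))
```

A value of 1 means the masks coincide and 0 means they do not overlap at all.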

Local Boundary Refinement

In the last several sections, we introduced methods that either track the object boundary or propagate user labels throughout the input image. An object can be extracted either as a closed contour or as the set of pixels sharing the same label. In this section, we examine two post-processing techniques to improve the accuracy of segmentation results.
One post-processing technique is to merge a small isolated sub-region into its surrounding sub-region, as proposed in [103]. For example, as shown in Fig. 3.25, region A is a small region without a label, so it can be merged into its surrounding region by this post-processing technique. On the other hand, to segment this small region out, a user must assign a label to it explicitly.
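The merging step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of [103]: regions are 4-connected components of identical labels, "small" means fewer than `min_size` pixels, and a small region takes the most frequent label found along its border. The function name and the `min_size` parameter are hypothetical.

```python
from collections import Counter, deque


def merge_small_regions(labels, min_size=4):
    """Relabel connected regions smaller than min_size with the most
    common label on their border (4-connectivity).

    labels: 2-D list of hashable labels; None can mark "no label".
    min_size: size cutoff (assumed, illustrative parameter).
    """
    h, w = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    seen = [[False] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx]:
                continue
            # Flood-fill one connected region of identical labels.
            target = labels[sy][sx]
            region, border = [], Counter()
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue
                    if labels[ny][nx] == target:
                        if not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                    else:
                        # Count the labels that touch this region.
                        border[labels[ny][nx]] += 1
            if len(region) < min_size and border:
                new_label = border.most_common(1)[0][0]
                for y, x in region:
                    out[y][x] = new_label
    return out


labels = [["bg", "bg", "bg", "bg"],
          ["bg", None, "bg", "bg"],
          ["bg", "bg", "fg", "fg"],
          ["bg", "bg", "fg", "fg"]]
cleaned = merge_small_regions(labels, min_size=2)
print(cleaned[1][1])
```

The single unlabeled pixel is absorbed by the surrounding background, while the larger foreground block is left untouched. A user who wants the small region kept would label it explicitly and exclude it from this step.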
For more accurate segmentation of the object boundary, we can develop a pixel-based refinement scheme along the boundary [4, 5]. In Lazy Snapping, Li et al. [5] represented an object boundary using triangles and allowed users to edit the boundary by dragging and moving boundary points. Such user editing helps to obtain better boundary locations.
For complex object boundaries such as hairy, furry, motion-blurred, and transparent boundaries, it is difficult to get a satisfactory hard-segmentation result even with user editing. Instead, it is common to apply the alpha matting technique as a post-processing step to refine the object boundary in this case. The goal of alpha matting is to calculate a soft segmentation that separates the foreground and background as accurately as possible. Since an object in a physical scene may have a finer spatial resolution than the size of a discretized image pixel, one pixel may contain a mix of both foreground and background information [107].
The soft segmentation can be represented in the form of:

I(x, y) = α(x, y)F(x, y) + [1 − α(x, y)]B(x, y), (3.66)

where I(x, y) is the observed image value at pixel (x, y), F(x, y) and B(x, y) are the foreground and background values at (x, y), and 0 ≤ α(x, y) ≤ 1 is the alpha matte function. This model was first proposed in [108] for the purpose of anti-aliasing in image segmentation. Each pixel along the object boundary is a blend of the foreground and background colors on the boundary, and the alpha value controls the weight of the foreground color.
When we evaluate Eq. (3.66) in a three-channel color space, there are 7 unknowns per pixel (the three color components of F(x, y), the three of B(x, y), and the alpha value α(x, y)) but only 3 known values (the color components of I(x, y)). Thus, the problem is ill-posed. Many regularization schemes have been proposed for Eq. (3.66) by setting constraints on F, B, and α [13, 107, 109, 110]. Generally speaking, the constraints on F(x, y) and B(x, y) are based on the assumption that they are smooth functions.
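To illustrate why constraining F and B makes the problem tractable, the sketch below assumes that F and B are already known at a pixel, which is an assumption for illustration since real matting methods must estimate them. In that case Eq. (3.66) gives three equations in the single unknown α, and a least-squares solution follows by projecting I − B onto F − B. The helper names `composite` and `estimate_alpha` are hypothetical.

```python
def composite(alpha, fg, bg):
    """Blend foreground and background colors following Eq. (3.66)."""
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def estimate_alpha(pixel, fg, bg):
    """Least-squares alpha for one pixel when F and B are assumed known.

    Projects (I - B) onto (F - B) and clamps the result to [0, 1].
    """
    diff = [f - b for f, b in zip(fg, bg)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # F equals B, so alpha cannot be identified; return a valid placeholder.
        return 0.0
    num = sum((i - b) * d for i, b, d in zip(pixel, bg, diff))
    return min(1.0, max(0.0, num / denom))


fg = (0.9, 0.6, 0.3)   # assumed foreground color (RGB in [0, 1])
bg = (0.1, 0.2, 0.8)   # assumed background color
pixel = composite(0.25, fg, bg)
print(estimate_alpha(pixel, fg, bg))
```

The recovered value matches the α used for blending up to floating-point error. When F and B are themselves uncertain, this projection is no longer sufficient, which is where the regularization schemes cited above come in.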
Several matting models have been proposed, such as Poisson matting, Bayesian matting, RW matting, closed-form matting, and robust matting. Sometimes, user input is required to identify foreground, background, and transitional regions, which are referred to as the trimap [13]. In the current context, a trimap can be generated automatically from the segmentation step by extracting the object boundary bound [103]. Soft Scissors (SS) [111] offers a real-time interactive matting tool, and it is implemented as Digital Film Tools PowerMask (http://www.digitalfilmtools.com/powermask/) with a user input similar to that of the edge-based Intelligent Scissors [41].
Another efficient boundary refinement technique is to apply an active contour method with an additional constraint [112] that is used to determine a globally optimal object boundary. It aims to locate the most likely object boundary by considering both the boundary and the regional information. For segmentation methods with a hard segmentation result, a probability map based on the GMM models of the foreground and background colors is first constructed. Then, the constrained active contour technique is adopted to find the optimal boundary location. As shown in [112], this technique is effective in improving random-walk methods [6] and the geodesic segmentation method [13] by generating an improved hard segmentation result.
To conclude, boundary editing and isolated region merging offer two post-processing solutions in image segmentation with explicit control. The alpha matting technique is particularly effective in handling complex boundaries such as hair and fur.

References:

https://link.springer.com/content/pdf/10.1007/978-981-4451-60-4.pdf
