
SIMULTANEOUS MULTI-FRAME SUBPIXEL BOUNDARY DEFINITION USING TOBOGGAN-BASED INTELLIGENT SCISSORS

FOR IMAGE AND MOVIE EDITING

by

Eric N. Mortensen

A dissertation submitted to the faculty of

Brigham Young University

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Department of Computer Science

Brigham Young University

September 2000

Copyright © 2000 Eric N. Mortensen. All Rights Reserved.

BRIGHAM YOUNG UNIVERSITY

GRADUATE COMMITTEE APPROVAL

of a dissertation submitted by

Eric N. Mortensen

This dissertation has been read by each member of the following graduate committee and by majority vote has been found to be satisfactory.

Date    William A. Barrett, Chair

Date    Bryan S. Morse

Date    Parris K. Egbert

Date    Quinn O. Snell

Date    Michael A. Goodrich

BRIGHAM YOUNG UNIVERSITY

As chair of the candidate’s graduate committee, I have read the dissertation of Eric N. Mortensen in its final form and have found that (1) its format, citations, and bibliographical style are consistent and acceptable and fulfill university and department style requirements; (2) its illustrative materials including figures, tables, and charts are in place; and (3) the final manuscript is satisfactory to the graduate committee and is ready for submission to the university library.

Date    William A. Barrett

Chair,Graduate Committee

Accepted for the Department

J. Kelly Flanagan

Graduate Coordinator

Accepted for the College

Nolan F. Mangelson

Associate Dean,

College of Physical and Mathematical Sciences

ABSTRACT

SIMULTANEOUS MULTI-FRAME SUBPIXEL BOUNDARY DEFINITION USING TOBOGGAN-BASED INTELLIGENT SCISSORS

FOR IMAGE AND MOVIE EDITING

Eric N. Mortensen

Department of Computer Science

Doctor of Philosophy

Intelligent Scissors is an interactive image segmentation tool that allows a user to select piecewise globally optimal contour segments (based on an optimal path search in a graph) that correspond to a desired object boundary. This dissertation uses tobogganing to raise the granularity of the image primitive above the pixel level, producing a region-based basic processing unit that is object-centered rather than device-dependent. The resulting region-based elements form the basis for several contributions to the field of computer vision in general and to Intelligent Scissors in particular. These contributions reduce the human time and effort needed for object selection with Intelligent Scissors while simultaneously increasing the accuracy of boundary definition.

The region-based image primitives resulting from tobogganing form the basis for a graph formulation that is many times smaller than the pixel-based graph used previously by Intelligent Scissors, thus providing faster, more interactively responsive optimal path computations. The object-centered atomic units also provide an efficient and consistent framework in which to compute a 4-parameter edge model, allowing subpixel boundary localization, noise-independent edge blur adjustment, and automatic alpha matte generation and color separation of boundary transition pixels. The increased size of the basic processing unit also facilitates an edge confidence measure that forms the basis for two new techniques, called confidence threshold snapping and live-wire path extension, which further reduce the human burden involved with object boundary definition by automatically finding and following object boundaries. Finally, this dissertation presents a new paradigm for simultaneously interacting with multiple frames from a temporal image sequence by parallelizing both the user input and the interactive visual feedback, thus allowing a user to interact with a montage of image frames in order to define the boundary of a moving object while adhering to the same interactive style that has been demonstrated to be effective for the single-image Intelligent Scissors.

ACKNOWLEDGMENTS

I would like to sincerely thank the many people who, through their encouragement, counsel, support, and advice, have made it possible for me to complete this work. To my advisor, Dr. William Barrett, who, by his patient instruction, has guided me along the road to completion, to Dr. Bryan Morse who has always been willing to discuss ideas, and to Dr. Parris Egbert for his friendship and encouragement.

I would particularly like to thank my dear wife Christina and my four children for their patience, support, and understanding through the years. Finally, my most sincere gratitude goes to God for the many blessings I have received at his hand.

Contents

List of Tables

List of Figures

1 Introduction

1.1 Statement of the Problem

1.2 Thesis Statement

1.3 Format and Chapter Overviews

2 Background and Related Work

2.1 Segmentation vs. Selection

2.2 Desirable Properties of User-Guided Segmentation Techniques

2.3 Classification of Segmentation Techniques

2.3.1 Feature-Based

2.3.2 Region-Based

2.3.3 Edge-Based

2.4 Related Work in Edge-Based Boundary Finding

2.4.1 Single-Frame: Dynamic Programming and Graph Searching


2.4.2 Single-Frame: Snakes and Active Contours

2.4.3 Single-Frame: Live-wire and Intelligent Scissors

2.4.4 Multi-Frame

2.5 Limitations of Previous Approaches

2.6 Approach Presented in this Dissertation

3 Toboggan-Based Intelligent Scissors

3.1 Tobogganing

3.1.1 Discontinuity Measure: Multi-scale gradient magnitude

3.1.2 Sliding and Grouping

3.1.2.1 4-connected vs. 8-connected neighborhood

3.1.2.2 Sliding policy and multiple neighboring minima

3.1.2.3 Prefilter the discontinuity image

3.1.3 Problem Areas

3.2 Weighted Graph Creation

3.2.1 Edge and Node Definition

3.2.2 Topological Special Cases

3.2.3 Edge Weights

3.2.3.1 Scaling edge costs

3.2.3.2 Multi-scale gradient magnitude: f_g

3.2.3.3 Student’s t-test: f_t

3.2.3.4 Path Curvature (Bend): f_b

3.3 Optimal Graph Search

3.3.1 Restricting Maximum Edge Length

3.3.2 Minimum Spanning Tree

3.3.3 Two-Level Active List

3.3.4 Edge-Based Seed Points

3.3.5 Bending Cost Considerations


3.4 Live-Wire Boundary Selection

3.4.1 Simple Free-Point Selection

3.4.2 Edge Selection and Minimum Seed-Point Boundary Definition

3.4.3 Interleaving Graph Expansion and Event Processing

3.4.4 Backup

4 Four-Parameter Edge Model

4.1 Previous Work

4.2 Edge Model

4.3 Extracting Profile Data

4.3.1 Reduction of Dimensionality

4.3.1.1 Domain projection vector: v_x

4.3.1.2 Range projection vector: v_y

4.3.2 Pixel Sampling

4.3.3 Translating the projection origin

4.4 Functional Fitting

4.4.1 Measurement Errors: σ_i

4.4.2 Gradient Descent

4.4.3 Local Quadratic Fit

4.4.4 Marquardt Minimization

4.4.5 Initial Parameter Values: Θ_0

4.5 Subpixel Boundary Localization and Polyline Fit

4.5.1 Smoothing the Polyline Boundary

4.5.2 Interactively Adjusting the Smoothing Factor

4.6 Color Separation and Alpha Blending

4.6.1 Background

4.6.2 Extracting Objects from Existing Images


4.6.3 Geometric (Model-Based) Alpha Blending and Compositing

4.7 Adjusting Edge Blur

5 Free-Point Splitting and Confidence Snapping

5.1 Preliminaries

5.1.1 Branch Extension Cost

5.1.2 Edge Confidence

5.2 Advanced Free-Point Splitting

5.2.1 Fixed-Length Live-Wire Path Extension

5.2.2 Virtual Free-Point Placement

5.2.3 Adaptive-Length Live-Wire Path Extension

5.3 Cursor Snapping via Confidence Thresholding

5.3.1 Fixed-Size Neighborhood Membership

5.3.2 Trimmed Neighborhood Membership

5.3.3 Subtree Creation and Nearest Snap Point

6 Temporal Sequence Extensions

6.1 Multi-Frame Live-Wire

6.1.1 Multi-Frame Montage

6.1.2 Montage Live-Wire

6.2 Multi-Frame Free-Point Correlation

6.2.1 Uncorrelated Free-Points

6.2.2 Correlated Free-Points

7 Results and Discussion

7.1 Results

7.1.1 Single Image

7.1.2 Temporal Sequence


7.2 Discussion

7.2.1 Efficiency

7.2.2 Accuracy

7.2.3 Reproducibility

8 Conclusions and Future Directions

8.1 Contributions

8.2 Future Work

Bibliography

Appendix A: User Interface and Command-Line Options

List of Tables

7.1 Efficiency comparison between pixel- and toboggan-based Intelligent Scissors

7.2 Accuracy comparison between pixel- and toboggan-based contours


List of Figures

2.1 Automatic segmentation hierarchy

2.2 Automatic segmentation results

3.1 Kernel size vs. convolution time for both 2-D and 1-D kernels

3.2 Tobogganing

3.3 Strict less-than sliding rule within a “flat” area of gradient terrain

3.4 Tobogganing using Algorithm 3.1

3.5 Low-reject filtering of the gradient magnitude

3.6 Edge localization problems

3.7 Graphical illustration of Eq. (3.20)

3.8 Graph formation

3.9 Special cases in graph formation

3.10 Two example computations of Eq. (3.40)

3.11 Graphical formulation of h_b(x), g_b(x), and map_b(i)

3.12 Example long edge broken into shorter edges

3.13 Modified (linear-time) insertion sort for an example bin

3.14 Example of how the bending cost can compromise optimality

3.15 Minimum spanning tree of the graph created from the block image

3.16 Boundaries defined with fewer than two seed points (i.e., one or zero)


4.1 Graphical formulation of Eq. (4.3) without noise

4.2 Graphical comparison between Eqs. (4.3) and (4.4)

4.3 Sampling the edge profile data

4.4 Unsmoothed subpixel boundaries

4.5 Polyline smoothing of subpixel boundary points

4.6 Interactive adjustment of smoothing scale

4.7 Cut and paste artifacts

4.8 Separation of I_s into n_s, B′_s, and F′_s for a two-color example

4.9 Compositions using traditional and geometric alpha blending

4.10 Example of blurring and sharpening without affecting the noise

4.11 Contour sharpening results on real world images

5.1 Graphical formulation of Eq. (5.17)

5.2 Graphical illustration of Eq. (5.21)

5.3 Minimum spanning tree of the graph created from the block image

5.4 Object boundary with multiple non-spanning edges

5.5 Single seed point boundary definition using live-wire path extension

5.6 Three problems of an incorrect path extension length

5.7 Major branches of a spanning tree

5.8 Example adaptive-length path extension

5.9 Results using adaptive-length live-wire path extension

5.10 Region expansion for confidence threshold snapping

5.11 Comparison between fixed and trimmed snap radii

5.12 Various examples of confidence threshold snapping

6.1 Montage display for multi-frame live-wire

6.2 Multi-frame free-point propagation and snapping

6.3 Temporal sequence object selection: a simple case

6.4 Results of using uncorrelated free-points

6.5 Free-point correlation via linear interpolation

7.1 Test image 1: Pixel- vs. toboggan-based Intelligent Scissors boundaries

7.2 Test image 2: Pixel- vs. toboggan-based Intelligent Scissors boundaries

7.3 Desktop scene: Pixel- vs. toboggan-based Intelligent Scissors boundaries

7.4 Spinal vertebrae: Pixel- vs. toboggan-based Intelligent Scissors boundaries

7.5 Left ventricle: Pixel- vs. toboggan-based Intelligent Scissors boundaries

7.6 Tulip: Pixel- vs. toboggan-based Intelligent Scissors boundaries

7.7 Still life image: Pixel- vs. toboggan-based Intelligent Scissors boundaries

7.8 Sailboats: Pixel- vs. toboggan-based Intelligent Scissors boundaries

7.9 Test sequence results using multi-frame Intelligent Scissors

7.10 Space ship sequence results using multi-frame Intelligent Scissors

7.11 Synthetic test image of circle for measuring segmentation accuracy

7.12 Comparison of the boundary accuracy for Fig. 7.11(b)

7.13 Comparison of the boundary accuracy for Fig. 7.11(c)

7.14 Desktop scene: Comparison of pixel- and toboggan-based contours

7.15 Vertebrae: Comparison of pixel- and toboggan-based contours

7.16 Tulip: Comparison of pixel- and toboggan-based contours

Chapter 1

Introduction

The ability to segment one or more images presented to a vision system (whether biological, optical, electronic, or otherwise) is a key step in image understanding. Image segmentation is the process of defining the regions and/or boundaries of one or more objects of interest in a digital image so that they can be separated from each other and the background. The ultimate goal of many computer vision systems is to “understand” an image, or set of images, by creating a descriptive representation of the image(s). As such, accurate and timely image segmentation is a key requirement of many higher-level vision systems.

This dissertation presents improvements to Intelligent Scissors [110,112], an interactive, general purpose, digital image selection tool that allows rapid and accurate object extraction from arbitrarily complex backgrounds using simple gesture motions with a mouse. The goal of this work is not only to extend the utility for selecting an object in a single image but also to expedite object extraction in temporal image sequences for subsequent image composition operations. In doing so, the desire is to create a single tool that allows a user to define a moving object in a temporal sequence by exploiting the techniques used to extract a still object from a single image. By displaying several frames of a movie clip as a montage, the user is able to use the same interactive style of Intelligent Scissors to simultaneously interact with all the visible frames. As the pointer is moved in an “active” frame, the live-wire is updated and displayed in all frames.

To achieve the necessary interaction with multiple image frames, this dissertation uses tobogganing [49,183] to presegment an image into small, homogeneous regions. These regions become the basic primitive in a weighted graph formulation used for interactive live-wire selection, thereby raising the granularity of the segmentation problem from the pixel level to a fundamental unit that (usually) contains multiple connected pixels. The resulting region-based graph is several times smaller than the previous pixel-based version of Intelligent Scissors, allowing for a more responsive optimal path computation within a multi-image, interactive live-wire environment. In addition, the region-based graph allows for edge costs that incorporate region-based features and more advanced curvature metrics. Further, tobogganing provides a consistent and efficient mechanism for computing a four-parameter edge model that estimates subpixel boundary position, edge blur, and foreground/background color. Finally, the region-based graph also facilitates a “proactive” path extension and a more intelligent cursor snapping in addition to providing a framework for future enhancements such as object selection, free-point sub-tree matching, live-wire feature-cost coupling, etc.
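The tobogganing idea can be illustrated with a simplified sketch. The function name and the strict less-than, 4-connected sliding rule here are illustrative assumptions, not the dissertation’s implementation, which uses a multi-scale gradient magnitude and the more careful sliding and grouping policies developed in Chapter 3:

```python
# Simplified tobogganing sketch (hypothetical names, plain Python lists).
# Each pixel "slides" downhill to its smallest 4-connected neighbor on a
# gradient-magnitude image until it reaches a local minimum; all pixels that
# reach the same minimum are grouped into one region.

def toboggan(grad):
    """grad: 2-D list of gradient-magnitude values; returns a grid of region labels."""
    rows, cols = len(grad), len(grad[0])

    def slide(r, c):
        # Repeatedly move to a strictly smaller neighbor (steepest descent).
        while True:
            best = (r, c)
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and grad[nr][nc] < grad[best[0]][best[1]]):
                    best = (nr, nc)
            if best == (r, c):          # local minimum reached
                return best
            r, c = best

    minima = {}                         # local minimum -> region label
    labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            labels[r][c] = minima.setdefault(slide(r, c), len(minima))
    return labels
```

Each resulting region, rather than each pixel, then becomes a node in the weighted graph, which is what shrinks the graph and speeds up the optimal path search.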

1.1 Statement of the Problem

Fully automatic general image segmentation is an unsolved problem due to the wide variety of image sources, content, and complexity. In fact, any general purpose image segmentation tool—such as the selection tools used in image manipulation packages like Adobe Photoshop [1]—will continue to require some degree of human guidance due to the essential role of the user in identifying which image components are of interest. Thus, intelligent tools that exploit high-level visual expertise but require minimal user interaction become appealing [126].

With the increasing use of digital images and movies in web design, entertainment, medical analysis, virtual environment creation, etc., the demand will continue to grow for fast and easy-to-use segmentation tools that accurately select, extract, measure, and/or define objects of interest. Unfortunately, basic and often tedious tools such as lassoing and the magic wand are still widely used when a complex image component must be separated from a nontrivial background. For this reason, researchers continue to search for better selection tools that can reduce the human burden.

Recently, this author, along with Dr. William Barrett, introduced a unique boundary based segmentation technique called Intelligent Scissors [110,112], which allows rapid and accurate object extraction using simple gesture motions with a mouse. Based on the live-wire interactive optimal path selection tool [10-11,109,111], Intelligent Scissors performs general purpose, interactive object segmentation by allowing the user to choose a minimum cost contour segment corresponding to a portion of the desired object boundary. As the mouse position comes in proximity to an object edge, a live-wire boundary “snaps” to and wraps around the object of interest. Compared to manual tracing (i.e., lassoing), object selection using Intelligent Scissors is many times faster, more accurate, and more reproducible [112].
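The machinery underneath this live-wire interaction is a single-source minimum-cost path search: once a seed point is placed, optimal paths from the seed to every other graph node are computed, and moving the cursor simply displays the precomputed path back to the seed. The following is a minimal Dijkstra-style sketch over a generic weighted graph; the function names are illustrative assumptions, and in the dissertation’s formulation the nodes are toboggan regions and the edge costs combine gradient, t-test, and curvature terms (Chapter 3):

```python
import heapq

def live_wire_paths(graph, seed):
    """graph: {node: [(neighbor, edge_cost), ...]}.
    Returns (cost, pointer) maps; following back-pointers from any node
    traces the minimum-cost path to the seed."""
    cost = {seed: 0.0}
    pointer = {seed: None}              # back-pointer toward the seed
    heap = [(0.0, seed)]
    while heap:
        c, u = heapq.heappop(heap)
        if c > cost.get(u, float("inf")):
            continue                    # stale heap entry; already relaxed
        for v, w in graph.get(u, ()):
            nc = c + w
            if nc < cost.get(v, float("inf")):
                cost[v] = nc
                pointer[v] = u
                heapq.heappush(heap, (nc, v))
    return cost, pointer

def trace(pointer, node):
    """Walk back-pointers from `node` to the seed; returns the seed-to-node path."""
    path = []
    while node is not None:
        path.append(node)
        node = pointer[node]
    return path[::-1]
```

Because the expansion is run once per seed, each subsequent cursor move costs only a cheap back-pointer walk, which is what makes the live-wire feedback immediate.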

Nevertheless, while Intelligent Scissors, as presented in [10-11,110-112], reduces (often greatly) the user input needed to define an object boundary compared to previous techniques, it by no means represents the minimum amount of human effort required for object selection. Further, even Intelligent Scissors becomes tedious when trying to define, on a frame-by-frame basis, the boundary of a moving object in a temporal sequence of images. Consequently, this dissertation presents several extensions to the Intelligent Scissors object selection tool with the primary goal of reducing the user input required to accurately define object boundaries and a secondary goal of facilitating image editing operations. While the original objective is, for the most part, to facilitate boundary definition of a moving object within a temporal image sequence for subsequent editing operations (e.g., composition), many of these extensions also reduce (sometimes greatly) the amount of human effort needed to segment an object from a single image. The objectives of this work are to:

1. Extend Intelligent Scissors to easily and accurately extract moving or changing objects from temporal or spatial image sequences.

2. Maintain the advantage Intelligent Scissors has over previous user-guided segmentation techniques—namely, the immediate interactive feedback afforded by live-wire optimal path selection.

3. Provide a tight coupling of an object’s contour properties between neighboring image frames.

4. Increase the potential accuracy of Intelligent Scissors as well as to facilitate more convincing compositions by providing subpixel boundary definition, automatic edge blur estimation, and foreground/background color separation.

5. Improve Intelligent Scissors’ general utility by providing more intelligent cursor snapping and introducing “proactive” live-wire to reduce the effort in placing seed points and the number of seed points needed.

The end goal is to extend Intelligent Scissors in order to extract moving objects from time-varying images with little or no more effort than is needed to select an object in a still image. In doing so, we wanted to adhere to the unique interactive style for which Intelligent Scissors is known—the immediate live-wire feedback. Thus, rather than define an object boundary in one frame and subsequently “batch” process the other frames, multi-frame Intelligent Scissors displays several frames as a montage and allows the user to interact with all visible frames simultaneously. As the user gestures in one frame, a live-wire is computed and displayed in all visible frames. Further, the user can easily jump between frames to make local adjustments during live-wire selection.

In terms of the effort required to define a moving object boundary through a temporal image sequence, a tool that requires a user to manually place seed points in every frame provides little or no benefit over single image Intelligent Scissors. Ideally, a user interacts within a single frame and Intelligent Scissors tracks the object through all frames and immediately displays the resulting boundary in process. Tracking of object boundaries typically requires matching of object features between neighboring images. Thus, another objective of this work is to explore methods of interactively matching free-point features through the sequence. Since matching occurs during boundary definition, information about the object—such as the foreground color and object shape in the vicinity of the point being matched—is not explicitly available. Though point matching techniques abound [172,147,185], many of them fail to choose the correct match when multiple “high confidence” possibilities are available or when local properties change. Without some higher-level knowledge of the object and how it moves, it is difficult to provide reliable matches of edge points in a dynamic system. Fortunately, the interactive nature of multi-frame
