DESIGN METHODS FOR FPGA-BASED IMPLEMENTATION OF COMBINATORIA
更新时间:2023-04-16 14:26:01 阅读量: 实用文档 文档下载
- design推荐度:
- 相关推荐
DESIGN METHODS FOR FPGA-BASED IMPLEMENTATION OF COMBINATORIAL SEARCH
ALGORITHMS
Iouliia Skliarova, Valery Sklyarov 1)
Abstract
Many combinatorial search problems can be efficiently formulated and solved over Boolean and ternary matrices that keep the initial and all the required intermediate data. Three of such problems, namely, the Boolean satisfiability, covering and graph coloring, have been examined and analyzed. The results of analysis make it possible suggesting a number of reusable functional blocks for the considered search algorithms, which provide significant support for the design of FPGA-based digital systems allowing miscellaneous combinatorial problems to be solved. The proposed technique enables the designers to concentrate their efforts on the algorithms and to minimize the design time required for less significant details of hardware implementation.
1. Introduction
There are many practical problems that can be formulated over such mathematical models as graphs, discrete matrices, sets, Boolean equations, etc. Majority of these models are mutually convertible, i.e. any of them might be selected for similar purposes. Many practical problems can be solved by applying various algorithms of combinatorial search over a chosen mathematical model. Such algorithms have two distinctive features. Firstly, they require considering a huge number of different variants. Secondly, these variants can be ordered and examined with the aid of a decision (search) tree that provides an efficient way for handling intermediate solutions.
Examples of combinatorial problems that can be solved with the aid of the mentioned above algorithms are Boolean satisfiability [5,9]; covering of Boolean matrices [7]; graph coloring [1], etc. Their results have been used for a number of practical applications such as electronic design automation [6], mobile computing [18], multimedia processors [16], embedded [4] and distributed [3] systems, formal verification [17] and many others [5,9]. The detailed analysis of combinatorial algorithms, their practical applications and publications in this area was done in [5,6,9,19].
1 Department of Electronics, Telecommunications and Informatics, IEETA, University of Aveiro, 3810-193 Aveiro, Portugal, iouliia@det.ua.pt, skl@det.ua.pt
Systems for solving combinatorial problems might be implemented in field-programmable devices (FPGA, in particular). The latter have a number of advantages, which have appeared because the considered tasks possess a set of specific features. Firstly, any task involves a huge number of similar operations. As a rule, these operations are not the same for different combinatorial problems. Thus, it is not easy to construct a universal combinatorial processor, i.e. processor’s instructions have to be customized for a particular problem that is going to be solved. This can easily be done with the aid of FPGA technology. Secondly, different practical applications might require solving combinatorial tasks with varying complexity. However, optimal results can be achieved in case if the size of processor operands permits any required operation to be performed in one clock cycle. Parameterizable circuits that provide such an opportunity can easily be implemented in FPGA. Thirdly, FPGA enable us to build on the same microchip any desired (customized) interface between a combinatorial accelerator and a general-purpose computational system (or any specialized system that requires an accelerator). Fourthly, the complexity of recent FPGA allows for the construction of a complete system-on-chip and a combinatorial accelerator can be implemented as an application-specific co-processor within this system.
Note that hardware permitting to carry out combinatorial search over various (alternative) models requires different FPGA resources, i.e. the complexity of the respective circuits depends essentially on the adopted mathematical model. Our experience has shown that a discrete (Boolean and ternary) matrix can be seen as a very adequate model for the considered computations [8,10].
This paper suggests a number of reusable blocks that can be employed for constructing application-specific hardware processors that permit combinatorial problems formulated over Boolean and ternary matrices to be solved. The blocks have been selected on the basis of analysis of different search algorithms and their primary operations. It is shown that such algorithms require recursive procedures that repeat the following basic steps, which are 1) reduction of a given or an intermediate matrix permitting to decrease the problem complexity; 2) selection of matrix elements (rows or columns) that incrementally compose a solution (or enable to conclude that the solution does not exist); 3) branching in case if more than one matrix element has to be selected; 4) termination of a branch in case if it is proved that the selected path does not permit neither to find a solution nor to improve a previously discovered solution. The selected blocks have been implemented both as Handel-C macros [2] and VHDL library modules.
2. Template for Combinatorial Search Algorithms
Figure 1 depicts the basic skeletal algorithm (template) of combinatorial search, which will be detailed below on examples of 1) Boolean satisfiability problem; 2) matrix covering problem; and 3) graph coloring problem.
Figure 1. Template for combinatorial search algorithms.
The algorithm is presented in form of hierarchical graph-schemes (HGS) [12]. The module Z0 in figure 1 describes the top-level recursive algorithm. The module Z1 implements a problem specific reduction technique, which possesses an important problem-independent feature: Z1 operates over the same instance of a Boolean/ternary matrix and the required reduction is provided through masking some of the matrix rows and columns. The module Z2 implements a problem specific selection technique and operates also over the considered above matrix. Problem specific operations are executed with the aid of a dynamically reconfigurable unit.
2.1. Boolean Satisfiability
Let us assume that a matrix U shown in figure 2 is given. Conversion between Boolean formulae and ternary matrices is shown in [10]. To solve the Boolean satisfiability problem it is necessary to find out a ternary vector w that is orthogonal to all rows of the matrix U. Orthogonality of two ternary vectors is defined as follows [19]: (m i ort m j) ? {m i} ∩ {m j} = ?, where {m i} ({m j}) is a set of Boolean vectors that correspond to the ternary vector m i(m j) by replacing the don’t care values (‘-‘) with all possible combinations of 0s and 1s. If and only if two ternary vectors are not orthogonal (let us designate this as m i ort m j) they do intersect in the Boolean space: (m i ort m j) ?(m i ins m j).
Basic operations of the algorithm in figure 1 realize so-called reduction and selection rules. For the Boolean satisfiability problem the following reduction rules can be applied, which permit an initial matrix to be simplified [11]:
1.If a column contains just don’t care values it must be deleted from the matrix.
2.All rows that are orthogonal to an intermediate vector w (that incrementally forms a solution)
must be removed from the matrix. All columns that correspond to the components of the vector w with values ‘1’ and ‘0’ must be deleted from the matrix.
3.If the matrix contains a row with just one component ‘0’ (‘1’) with an index i then the
element i of the vector w must be assigned the value ‘1’ (‘0’), i.e. the inverted value.
4.If there is a column j in the matrix without values ‘1’ (‘0’) then the element j of w can be
assigned the value ‘1’ (‘0’).
The selection rules make possible to split the problem into sub-problems and to examine them in turn. For the Boolean satisfiability problem the following selection rules can be applied [11]. A column c max, which contains the maximum number N1+ N0 of values ‘1’ (N1) and ‘0’ (N0) is selected. If N1≥ N0 then the ‘0’ is assigned to the element max of w, otherwise (if N0 > N1) the value ‘1’ is assigned to the same element if w. It permits to construct a minor (sub-matrix) which will be examined at the next step.
Figure 2 illustrates how these rules can be applied to a practical example (the numbers of rules are shown in circles). The sequence of steps is marked by letters a, b, …, g. The vector w is incrementally constructed in an order indicated by the numbers 1-5 enclosed in squares. The last number 5 detects a solution.
Figure 2. Using the algorithm in figure 1 for solving the Boolean satisfiability problem.
2.2. Matrix Covering
Let us consider now how to use the same template in figure 1 to solve another known combinatorial problem that permits to find out a minimal column cover of a Boolean matrix [10]. Suppose that it is given a matrix shown at the top-left corner of figure 3. Columns a,b and d represent one
possible solution, i.e. the minimal number of columns that have at least one value ‘1’ in each row of the matrix. The following set of rules that permit to reduce (simplify) the matrix can be used:
1.If for i≠j, row i & row j = row j then the row i can be removed from the matrix, for example,
row6 & row5 = row5 and row6 has to be removed from the matrix.
2.If for i≠j, column i & column j = column i then the column i can be removed from the matrix.
3.If there is a row, which does not have values ‘1’ then covering cannot be found.
For the selection purposes the following rules can be used:
1.If a row has just one value ‘1’ then the respective column (i.e. the column containing this ‘1’)
must be included into the covering and deleted from the matrix. All the rows that have value ‘1’ in this column must also be deleted from the matrix.
2.If all rows have more than one value ‘1’ then the first row from the top of the matrix that
contains the minimum number of ones has to be selected. For this row it is necessary to analyze all possible branches and the number of such branches is equal to the number of values ‘1’ in the row. Obviously, any branch has to be examined until the step where an intermediate result becomes equal to or worse than any previously discovered covering.
Figure 3. Using the algorithm in figure 1 to find out the minimal column cover of the matrix.
Figure 3 shows all the steps that are required in order to find out the solution for the given matrix. There is just one branching point in figure 3: b-e. After getting the first solution {a,b,d} we are interested just in coverings composed of 2 or less columns. Thus, it is not necessary to traverse all branches and forward propagation can be stopped at any point that gives a 2-component incomplete solution.
2.3. Graph Coloring
For solving the graph-coloring problem it is necessary to paint graph nodes in such a way that: 1) any two nodes connected with an edge are painted in different colors; 2) the number of used colors is minimal. The details of converting a graph to a ternary matrix are given in [15]. The resulting matrix μ(G) has the same number of rows as the number N of vertices in the graph G and at maximum N-1 columns. If and only if two vertices i and j are connected with an edge in G then the matrix rows m i and m j must be orthogonal.
The algorithm permitting to solve the graph coloring problem consists in finding the minimal number K of such subsets of rows in the matrix μ(G) that any subset does not contain mutually orthogonal rows. In other words, all the rows belonging to the same subset must intersect in the Boolean space. The number K is the minimal number of colors for the graph G. Rows from each subset correspond to the vertices of the graph G that can be painted in the same color.
Some of the reduction and selection rules can be directly borrowed from the method of condensation, proposed in [19]. The following reduction rules can be used:
1.If, after selecting a new color, the matrix μ(G) contains a column without values ‘0’ (‘1’) then
this column can be deleted;
2.If the matrix μ(G) contains a row with just don’t care values then this row can be deleted
from the matrix and included in the constructed subset;
3.All the rows that have already been included in the constructed subsets are removed from the
matrix μ(G).
The coloring algorithm consists in the application of the following steps [15]:
a)Choose a new color (i.e. create a new initially empty subset);
b)Apply the reduction rules;
c)Consider the topmost row m i in the matrix;
d)Include the row m i in the constructed subset and delete it from the matrix;
e)Find out all other rows intersecting (i.e. not orthogonal) with the vector m i;
f)Select the first not tried yet row m j from point e), include it in the constructed subset and then
delete it from the matrix;
g)Assign m i = m i ins m j and repeat the steps e)-g) if this is possible. If this is not possible go to
the step h);
h)If the intermediate matrix is not empty repeat the steps a),…,g). Otherwise, store the solution
found and then backtrack to the nearest branching point (set at the step f) and try to find a better solution by repeating the steps f)…h).
2.4. Common Features of Combinatorial Search Algorithms
The considered above search algorithms have similar characteristics. Their distinctive feature is the execution of problem-specific operations and traversing a decision (search) tree starting from its root by involving such procedures as forward search and backtracking. Any branching point can be considered as extracting a sub-tree with a local root. These algorithms possess several common features:
1. They can be formulated recursively, therefore a recursive control unit might be very helpful.
2. They do not change the initial data (i.e. the initial matrix) because the matrix reduction can be
provided by masking some rows/columns and using just the remainder of the matrix.
3. They invoke a very limited number of operations (such as reduction and selection operations
considered above), which have to be applied to a huge volume of data.
4. Subsets of the required operations are usually not the same for different combinatorial
problems. Thus, a reconfigurable control unit is required which permits to customize a sequence of control steps for a particular combinatorial problem.
5. In order to perform forward and backward propagation we can use a stack memory that stores
and restores intermediate results (such as the values of mask registers) in branching points. 6. The algorithms can be decomposed into two levels of control operations. The top-level
(recursive) sequence is the same for different algorithms (see figure 1). The bottom level sequence permits the required operations over Boolean and ternary vectors to be executed. As
a rule, these operations are not the same for different algorithms and it requires changes to the
functionality of the respective circuit.
3. Specification of Reusable Functional Blocks
An analysis of combinatorial search algorithms applied to Boolean and ternary matrices allowed to select the following functional blocks (FB) for implementing these algorithms in hardware:
1.Memory permitting to store Boolean and ternary matrices and to provide direct access to both
rows and columns.
2.Mask registers allowing to use the same storage for handling the initial matrix and all the sub-
matrices (minors), which can be constructed by removing some rows/columns.
3.Stacks for managing forward and backward propagation steps, which permit to construct sub-
matrices sequentially and to return back to any intermediate sub-matrix if required. These stacks have to store masks for the sub-matrices and the contents of general purpose registers.
4.General-purpose registers for keeping Boolean/ternary vectors.
5. A device for computations over Boolean and ternary vectors.
6.Control circuits, which make possible to execute operations at two levels: a) operations over
either one or two Boolean/ternary vectors; and b) operations over the entire matrices.
7.Additional auxiliary circuits for testing, debugging and interacting with the considered
system targeted to combinatorial search algorithms over Boolean and ternary matrices.
To make the considered above blocks reusable we have to provide them with the properties of parameterization and reconfigurability. The first property allows for scalability in such a way that the considered blocks can be applicable to matrices of different dimensions (i.e. with different number of rows and columns). The second property permits to configure control circuits at any level in such a way that different operations over vectors can be chosen.
Figure 4 depicts the proposed architecture of a reconfigurable processing unit for executing combinatorial search algorithms over Boolean and ternary matrices.
Figure 4. Processing unit constructed from the functional blocks (FB) indicated in the points 1-7.
The following assumptions have been taken into account:
1. A ternary matrix M t is represented by two binary matrices M b0 and M b1, where M b0 contains
the values ‘1’ in the positions that have the values ‘0’ in M t, and M b1 contains the values ‘1’ in the positions that have the values ‘1’ in M t.
2.Dual access to the matrix rows/columns is provided through keeping in memory two matrices:
the original matrix and its transpose.
3.Hierarchical modular specification is used to implement control circuits in such a way that an
execution of low-level operations is provided with the aid of modules (sub-algorithms), which can be activated (called) by a control circuit running a higher level algorithm.
4.Changing low-level and high-level algorithms is achieved by reconfiguring the control circuit
modeled by a reconfigurable hierarchical finite state machine [12]. An example of such machine for implementing operations over Boolean and ternary vectors was considered in detail in [14].
4. Implementation Details
A number of FPGA-based circuits for implementing combinatorial search algorithms over Boolean and ternary matrices have been designed from specifications in VHDL and Handel-C and tested in hardware. Different F
B described in VHDL have been examined in [8]. For example, the functional blocks 1-5 indicated in the previous section have been designed and tested in Xilinx FPGA of Virtex-EM family. Hierarchical finite state machines for VHDL and Handel-
C projects have also been designed and tested in Xilinx FPGA of Spartan-II family and the respective results were reported in [13]. Complete Handel-C projects were described and tested for all the problems considered in the paper and the relevant results were reported in [13,15].
The results of experiments have shown that the proposed technique makes possible to shorten essentially the design time of combinatorial processors. The proposed FB take into account many specific features of combinatorial search algorithms and they have been optimized for the considered problems. It makes possible to provide block-based high-level design, i.e. to concentrate the efforts of the designer on the considered algorithms avoiding (or at least minimizing) the details of hardware implementation. Since the proposed reusable blocks were implemented as a set of Handel-C macros and VHDL library modules, it allows to consider either the design flow on the basis of a system-level specification language or a widely-used hardware description language. A set of additional blocks described in Handel-C and VHDL provide very effective visualization and debugging tools and they can be inserted into “debug version” and removed from the “release version”. They are also very helpful for experimental purposes.
5. Conclusion
The paper analyzes different combinatorial problems and demonstrates that they can be formulated over Boolean and ternary matrices. It is shown that many problems from this area might be solved with the aid of search algorithms that have a number of common features. The paper suggests primary building blocks for these algorithms, which provide unique operations over matrices, support stacks and implement recursive control algorithms. These blocks were described in Handel-C and in VHDL and implemented in FPGA. Three combinatorial algorithms for the Boolean satisfiability, covering and graph coloring problems were examined. The results of implementations have demonstrated a significant speedup in the design process and some other advantages such as support for visualization of intermediate results and for the debugging.
6. References
[1] Bibliographies and links on graph coloring, available at: http://www.cs.ualberta.ca/~joe/Coloring/index.
[2] Celoxica products, available at: 9d351226ccbff121dd368393.
[3] Feldman, R., Haubelt, C., Monien, B., Teich, J., Fault Tolerance Analysis of Distributed Reconfigurable Systems
Using SAT-Based Techniques, in: Proceeding of FPL’2003, Portugal, pp 478-487, 2003.
[4] Goossens, G., Praet, J.V., Lanneer, D., et al., Embedded Software in Real-Time Signal Processing Systems: Design
Technologies, in: Proceedings of the IEEE, vol. 85, no. 3, March, 1997, pp. 436-454.
[5] Gu, J., Purdom, P.W., Franco, J., Wah, B.W., Algorithms for the Satisfiability (SAT) Problem: A Survey, in:
DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 35, pp. 19-151, 1997
[6] Marques-Silva, J.P., Sakallah, K.A., Boolean Satisfiability in Electronic Design Automation, in: Proceedings of
DAC, USA, Los Angeles, 2000.
[7] Plessl, C., Platzner, M., Instance-Specific Accelerators for Minimum Covering, in: Proc. 1st Int. Conf. ERSA’2001,
Las Vegas, pp. 85-91.
[8] Skliarova, I., Reconfigurable Architectures for Problems of Combinatorial Optimization, Ph.D. Thesis, University
of Aveiro, Portugal, 2004.
[9] Skliarova, I., Ferrari, A.B., Reconfigurable Hardware SAT Solvers: A Survey of Systems, in: IEEE Transactions on
Computers, vol. 53, issue 11, pp 1449-1461, 2004.
[10] Skliarova, I., Ferrari, A.B., The Design and Implementation of a Reconfigurable Processor for Problems of
Combinatorial Computation, in: Journal of Systems Architecture, Special Issue on Reconfigurable Systems, vol.
49, nos. 4-6, pp, 211-226, 2003.
[11] Skliarova, I., Ferrari, A.B., A Software/ Reconfigurable Hardware SAT Solver, in: IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, vol. 12, n. 4, pp 408-419, 2004.
[12] Sklyarov, V., Hierarchical Finite-State Machines and Their Use for Digital Control, in: IEEE Transactions on
VLSI Systems, vol. 7, no. 2, pp 222-228, 1999.
[13] Sklyarov, V., Skliarova, I., Architecture of a Reconfigurable Processor for Implementing Search Algorithms over
Discrete Matrices, in: Proceedings of the International Conference on Engineering of Reconfigurable Systems and Algorithms - ERSA'2003, T.P. Plaks (ed.), Las Vegas, USA, June 2003, pp. 127-133.
[14] Sklyarov, V., Skliarova, I., Oliveira, A., Ferrari, A.B., A Dynamically Reconfigurable Accelerator for Operations
over Boolean and Ternary Vectors, in: Euromicro Symp. on Digital System Design, Belek, Turkey, pp. 222-229, Sept. 2003.
[15] Sklyarov, V., Skliarova, I., Pimentel, B., Modeling and FPGA-based implementation of graph coloring algorithms,
in: Proc. of the 3rd Int. Conf. on Autonomous Robots and Agents - ICARA 2006, New Zealand, Dec. 2006. [16] Thorup, M., Combinatorial Power in Multimedia Processors, ACM SIGARCH Computer Architecture News, vol.
31, no. 5, December, 2003, pp.6-11.
[17] Velev, M.N., Bryant, R.E., Effective use of Boolean Satisfiability Procedures in the Formal Verification of
Superscalar and VLIW Microprocessors, in: Proceedings of the 38th conference on Design automation, Las Vegas, USA, 2001, pp. 226-231.
[18] Vidyarthi, G., Ngom, A., Stojmenovic, I., Combinatorial Evolutionary Methods in Wireless Mobile Computing,
in: Combinatorial Optimization in Communication Networks, Kluwer Academic Publishers, 2005.
[19] Zakrevski, A.D., Logical Synthesis of Cascade Networks, Moscow: Science, 1981.
正在阅读:
DESIGN METHODS FOR FPGA-BASED IMPLEMENTATION OF COMBINATORIA04-16
团宁委联30号10-19
中考誓师大会校长发言稿最新3篇03-26
初中数学优秀课展评活动资料人教版数学七年级上册《3.1.1 一元一次方程》教学设计 松滋市南海中学 祝爽 - 图文09-18
浅谈当前基础教育失衡发展的现状及对策05-09
文学概论复习资料05-07
入股分红协议书08-10
给排水采暖施工组织设计12-22
第1章 传感与检测技术绪论05-31
中国上下五千年论文06-17
- 1Technical Specification Methods
- 2Technical Specification Methods
- 3第九讲 软件复用与构件化软件开发(Software Reuse and Software Design based Component
- 4Implementation of conformal covariance by diffeomorphism symmetry
- 5Computer Methods in Applied Mechanics and Engineering
- 6sizing in conceptual design at BWM
- 7Design-Expert 8.0使用指南-Multifactor RSM-Optimal design
- 8Interaction Design Group
- 9Interaction Design Group
- 10Simulation in Hardware Design and Testing
- 教学能力大赛决赛获奖-教学实施报告-(完整图文版)
- 互联网+数据中心行业分析报告
- 2017上海杨浦区高三一模数学试题及答案
- 招商部差旅接待管理制度(4-25)
- 学生游玩安全注意事项
- 学生信息管理系统(文档模板供参考)
- 叉车门架有限元分析及系统设计
- 2014帮助残疾人志愿者服务情况记录
- 叶绿体中色素的提取和分离实验
- 中国食物成分表2020年最新权威完整改进版
- 推动国土资源领域生态文明建设
- 给水管道冲洗和消毒记录
- 计算机软件专业自我评价
- 高中数学必修1-5知识点归纳
- 2018-2022年中国第五代移动通信技术(5G)产业深度分析及发展前景研究报告发展趋势(目录)
- 生产车间巡查制度
- 2018版中国光热发电行业深度研究报告目录
- (通用)2019年中考数学总复习 第一章 第四节 数的开方与二次根式课件
- 2017_2018学年高中语文第二单元第4课说数课件粤教版
- 上市新药Lumateperone(卢美哌隆)合成检索总结报告
- IMPLEMENTATION
- COMBINATORIA
- METHODS
- DESIGN
- BASED
- FPGA
- 高考百日冲刺发言稿
- 黑龙江省线下实体店无理由退货指引(试行)
- 双色球之红球的万能组合精选版
- 创业协会新会员见面会策划书
- 三八妇女节户外运动活动策划
- 服装仓储配送物流项目实施计划方案书
- 2022年 特色文化旅游基础设施建设项目可行性研究报告word
- 基于小学道德与法治课与法治教育融合的思考-最新教育资料
- 某投资建设年产1000艘玻璃钢游艇生产项目可行性研究报告
- 高三年级主任述职报告
- 2022年长安大学分析化学(同等学力加试)复试实战预测五套卷
- 液压顶升机项目可行性分析报告(模板参考范文)
- 危险化学品题库--易错题01
- 2015年上半年海南省注会《审计》:内部控制与控制测试考试试卷
- 广东省佛山市足浴服务企业名录2022版291家
- 培养小学生良好习惯实践与研究报告
- 经济学院大学生社会实践报告推荐
- 500千伏输电线路工程项目管理实施规划
- 大乐透历史开奖号码(更新至2022018期)
- 山东省聊城文轩中学2015届九年级下学期第一次阶段性测试物理试卷