唐常杰翻译的计算理论导引14

更新时间:2023-07-22 21:10:01 阅读量: 实用文档 文档下载

说明:文章内容仅供预览,部分内容可能不全。下载后的文档,内容与下面显示的完全一致。下载之前请确认下面内容是否您想要的,是否完整无缺。

唐常杰翻译的计算理论导引PPT

Lecture Notes for Computation Theory Book :《计算理论导引》Introduction to the Theory of Computationby Michael Sipser

Section 6.4

信息的定义 (可考虑部分略讲) Instructor : 唐常杰TangChangjie@

/~chjtang

Students : Ph.D. 2000--2006, SCU

Style可计算理论 2011-6-16

: Lecture / SeminarCS_Dept.Sichaun Univ. 1/77

唐常杰翻译的计算理论导引PPT

可计算理论

2011-6-16

CS_Dept.Sichaun Univ.

2/77

唐常杰翻译的计算理论导引PPT

可计算理论

2011-6-16

CS_Dept.Sichaun Univ.

3/77

唐常杰翻译的计算理论导引PPT

可计算理论

2011-6-16

CS_Dept.Sichaun Univ.

4/77

唐常杰翻译的计算理论导引PPT

OutlineChapter 6.4: Definition of Information by Descriptive / Kolmogorov Complexity 若干补充 Incompressible Strings Chapter 6 Arguments by Incompressibility Learning theory

可计算理论

2011-6-16

CS_Dept.Sichaun Univ.

5/77

唐常杰翻译的计算理论导引PPT

Measuring Information

ep213 cp 146

Standard information theory considers each n bitstring in {0,1}n equal (all have probability 2–n). However, we feel a difference between the string A=“0101010101010101010101010101010101” 有规律地 重复01串 17次,可压缩为 01#17 and the string (of equal length) B=“0101110101001010001110001100011011”. 比较复杂,几句化说不清的,信息量较大 直观感觉: 长话短说 的 最短尺寸,可用来度量其信息量可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 6/77

唐常杰翻译的计算理论导引PPT

Measuring Information ep213 cp 140Standard information theory considers each n bitstring in {0,1}n equal (all have probability 2–n). However, we feel a difference between the string A=“0101010101010101010101010101010101” 有规律地 重复01串 17次,可压缩为 01#17 and the string (of equal length) B=“0101110101001010001110001100011011”. 比较复杂,几句化说不清的,信息量较大 直观感觉: 长话短说 的 最短尺寸,可用来度量其信息量可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 7/77

唐常杰翻译的计算理论导引PPT

Measuring Information ep213

cp 140

Standard information theory considers each n bitstring in {0,1}n equal (all have probability 2–n). However, we feel a difference between the string A=“0101010101010101010101010101010101” 有规律地 重复01串 17次,可压缩为 01#17 and the string (of equal length) B=“0101110101001010001110001100011011”. 比较复杂的,短话说不清的,信息量较大 直观感觉: 表达语义的 最短尺寸,可用来度量其信息量可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 8/77

唐常杰翻译的计算理论导引PPT

RegularityWe can give a short description of “0101010101 010101010101010101010101” by “17 times 01”.

For the other “01011101010010100011100011 00011011” this seems more problematic. Suggests: 规律性使得描述较短(信息量较小)

A regular string is a string that has a short description; an irregular one has no such summary.可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 9/77

唐常杰翻译的计算理论导引PPT

RegularityWe can give a short description of “0101010101 010101010101010101010101” by “17 times 01”.

For the other “01011101010010100011100011 00011011

” this seems more problematic. Suggests: 规律性 使得描述较短(有规律的,信息量较小)

A regular string is a string that has a short description; an irregular one has no such summary.可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 10/77

唐常杰翻译的计算理论导引PPT

Allowed Descriptions

用Pi表示一长串01 Pi表示一长串01

Problem: What do we consider a description? What may be an obvious description for one, may be illegible for somebody else: 3.14…….. 011001001000011111101101010100010001 000010110100011000010001101001100010 011000110011000101000101110000000110 111000001110011010001001010010… We need a proper definition of ‘a description’.可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 11/77

π

唐常杰翻译的计算理论导引PPT

Allowed Descriptions

用Pi表示一长串01 Pi表示一长串01

Problem: What do we consider a description? What may be an obvious description for one, may be illegible for somebody else: 3.14…….. 011001001000011111101101010100010001 000010110100011000010001101001100010 011000110011000101000101110000000110 111000001110011010001001010010… We need a proper definition of ‘a description’.可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 12/77

π

唐常杰翻译的计算理论导引PPT

‘Turing Describable’ Describable’

ep215 cp141规律的描述和 输入

重复17次 01 TM M w 说明可用<M>w的长度来描述信息量 X 向复杂度(信息量)为 Min ({ |<M>|+|w| :: x=M(w), M is TM } )例 用Winzip 压缩A.doc 成为A.zip , 121K ·| < Winzip>|+|A.zip| 能描述Z.Doc的复杂度

Some details need to be sorted out first though…

可计算理论

2011-6-16

CS_Dept.Sichaun Univ.

13/77

唐常杰翻译的计算理论导引PPT

‘Turing Describable’ Describable’

ep215 cp141规律的描述和 输入

重复17次 01 TM M w 说明可用<M>w的长度来描述信息量 X 的 复杂度(信息量)为 Min ({ |<M>|+|w| :: x=M(w), M is TM } )例 用Winzip 压缩A.doc 成为A.zip , 121K ·| < Winzip>|+|A.zip| 能描述Z.Doc的复杂度

Some details need to be sorted out first though… 又例:某个问题用亿次机算,最少要用1个月 TM M time可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 14/77

唐常杰翻译的计算理论导引PPT

‘Turing Describable’ Describable’

ep215 cp141规律的描述和 输入

重复17次 01 TM M w 说明可用<M>w的长度来描述信息量 X 的 复杂度(信息量)为 Min ({ |<M>|+|w| :: x=M(w), M is TM } )例 用Winzip 压缩A.doc 成为A.zip , 121K ·| < Winzip>|+|A.zip| 能描述Z.Doc的复杂度

压缩程序复 杂一些, 压缩出来的 文件可能小 一些

Some details need to be sorted out first though… 又例:某个问题用亿次机算,最少要用1个月 TM M time可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 15/77

唐常杰翻译的计算理论导引PPT

Measuring the size of (M,y)We want to express the size of the TM M and y in a number of bits. But how many bits is a specific (M,y) pair? Other key idea: We fix a universal Turing machine U that,on input <M,y>, simulates the TM M on y (yielding output x).为了一致地比较,用通用图灵机 U,模拟M on y int U (<M,y>) { retur

n ( M(y) ); }

How does the <M,y> encoding work?可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 16/77

唐常杰翻译的计算理论导引PPT

Measuring the size of (M,y)We want to express the size of the TM M and y in a number of bits. But how many bits is a specific (M,y) pair? Other key idea: We fix a universal Turing machine U that,on input <M,y>, simulates the TM M on y (yielding output x).为了一致、公平 地比较,用通用图灵机 U,模拟M on y int U (<M,y>) { return ( M(y) ); }

How does the <M,y> encoding work?可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 17/77

唐常杰翻译的计算理论导引PPT

An Encoding for <M,y>

ep214, cp147 思想 :0,1和分隔符号,需三符号,可用三进位, 但这里巧用 二进位编码 0,1和分隔符号,需三符号,可用三进位,

The description of the TM M and its input y is going to be one long bit-string:M 分隔符 w

1 1 0 0 L L L 1 0 1 0 1 L 0 1_ _ _ 1 4 4 4 4 4 4 3 1 4 4 3 4 4 4 2 4 4 4 4 2 4Turing machine M input y

How do we know where <M> stops and <y> starts? We will use a self-delimiting code for <M>: two bits “00” for ‘zero’, two bits “11” for ‘one’, and “01” for ‘end of string’. 相当于编码为4进位,可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 18/77

唐常杰翻译的计算理论导引PPT

Description Length of <M,y>For the encoding of M and x we concatenate the self-delimiting/double bit description of M with y. 为什么选最小描述? 1. 无最长,2. 较长的不惟一, 3. 最短的是唯一的 Hence from now on: <M,y> = <M>y. For the length of <M,y> this implies: |<M,y>| = |<M>| + |y|如果 M 变化 ,则标准 不统一

Note that the y∈{0,1}* is encoded trivially.可计算理论 2011-6-16 CS_Dept.Sichaun Univ. 19/77

唐常杰翻译的计算理论导引PPT

Description Length of <M,y>For the encoding of M and x we concatenate the self-delimiting/double bit description of M with y. 为什么选极小描述? (1). 无最长描述,(2). 较长的不 惟一,(3). 最短的是唯一的 有一个公理(策默骆)自然数的子集中必有最小数 Hence from now on: <M,y> = <M>y. For the length of <M,y> this implies: |<M,y>| = |<M>| + |y| 直观解释: 解码机长 +密码长 =复杂度, Note that the y∈{0,1}* is encoded trivially.

如果 M 变化, 则 用于比较的标 准 不统一

可计算理论

2011-6-16

CS_Dept.Sichaun Univ.

20/77

唐常杰翻译的计算理论导引PPT

Minimum Description x(Fix a universal Turing machine U.) (例如固定为WinZip.exe) The minimal description d(x) is the shortest string <M>y such that U on <M>y outputs x.用|d(x)| 描述X的复杂度

The length |d(x)| will be the complexity of x…

可计算理论

2011-6-16

CS_Dept.Sichaun Univ.

21/77

本文来源:https://www.bwwdw.com/article/ttzm.html

Top