TensorFlow - Study Notes - 20160606
Updated: 2024-04-21 18:11:01
Variables: Creation, Initialization, Saving, and Loading
Contents

Creation
Initialization
  Initialization from another variable
  Custom initialization
Saving and restoring
  Checkpoint files
  Saving variables
  Restoring variables
  Choosing which variables to save and restore

Reading data
  Contents
    Reading data
  Feeding
  Reading from files
    Filenames, shuffling, and epoch limits
    File formats
    Preprocessing
    Batching
    Creating threads to prefetch using QueueRunner objects
    Filtering records or producing multiple examples per record
    Sparse input data
  Preloaded data
  Multiple input pipelines

Tensor Transformations
  Contents
    Tensor Transformations
  Casting
    tf.string_to_number(string_tensor, out_type=None, name=None)
    tf.to_double(x, name='ToDouble')
    tf.to_float(x, name='ToFloat')
    tf.to_bfloat16(x, name='ToBFloat16')
    tf.to_int32(x, name='ToInt32')
    tf.to_int64(x, name='ToInt64')
    tf.cast(x, dtype, name=None)
  Shapes and Shaping
    tf.shape(input, name=None)
    tf.size(input, name=None)
    tf.rank(input, name=None)
    tf.reshape(tensor, shape, name=None)
    tf.squeeze(input, squeeze_dims=None, name=None)
    tf.expand_dims(input, dim, name=None)
  Slicing and Joining
    tf.slice(input_, begin, size, name=None)
    tf.split(split_dim, num_split, value, name='split')
    tf.tile(input, multiples, name=None)
    tf.pad(input, paddings, name=None)
    tf.concat(concat_dim, values, name='concat')
    tf.pack(values, name='pack')
    tf.unpack(value, num=None, name='unpack')
    tf.reverse_sequence(input, seq_lengths, seq_dim, name=None)
    tf.reverse(tensor, dims, name=None)
    tf.transpose(a, perm=None, name='transpose')
    tf.gather(params, indices, name=None)
    tf.dynamic_partition(data, partitions, num_partitions, name=None)
    tf.dynamic_stitch(indices, data, name=None)

Variables
  Contents
    Variables
  Variables
    class tf.Variable
  Variable helper functions
    tf.all_variables()
    tf.trainable_variables()
    tf.initialize_all_variables()
    tf.initialize_variables(var_list, name='init')
    tf.assert_variables_initialized(var_list=None)
  Saving and Restoring Variables
    class tf.train.Saver
    tf.train.latest_checkpoint(checkpoint_dir, latest_filename=None)
    tf.train.get_checkpoint_state(checkpoint_dir, latest_filename=None)
    tf.train.update_checkpoint_state(save_dir, model_checkpoint_path, all_model_checkpoint_paths=None, latest_filename=None)
  Sharing Variables
    tf.get_variable(name, shape=None, dtype=tf.float32, initializer=None, trainable=True, collections=None)
    tf.get_variable_scope()
    tf.variable_scope(name_or_scope, reuse=None, initializer=None)
    tf.constant_initializer(value=0.0)
    tf.random_normal_initializer(mean=0.0, stddev=1.0, seed=None)
    tf.truncated_normal_initializer(mean=0.0, stddev=1.0, seed=None)
    tf.random_uniform_initializer(minval=0.0, maxval=1.0, seed=None)
    tf.uniform_unit_scaling_initializer(factor=1.0, seed=None)
    tf.zeros_initializer(shape, dtype=tf.float32)
  Sparse Variable Updates
    tf.scatter_update(ref, indices, updates, use_locking=None, name=None)
    tf.scatter_add(ref, indices, updates, use_locking=None, name=None)
    tf.scatter_sub(ref, indices, updates, use_locking=None, name=None)
    tf.sparse_mask(a, mask_indices, name=None)
    class tf.IndexedSlices

Class tensorflow::Session
  Member Summary
  Member Details
  Session management
    class tf.Session
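When you train a model, you use variables to hold and update parameters. Variables are in-memory buffers containing tensors. They must be explicitly initialized when the model is built, and they should be saved to disk after training. The saved values can later be loaded to continue training or to analyze the model.

This document describes the following two TensorFlow classes. See the full API documentation for each:
• The tf.Variable class
• The tf.train.Saver class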
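Creation

When you create a variable, you pass a tensor as its initial value to the Variable() constructor. TensorFlow provides a collection of ops that produce tensors often used for initialization, from constants or from random values.

Note that all of these ops require you to specify the shape of the tensor. That shape automatically becomes the shape of the variable. Variables generally have a fixed shape, but TensorFlow provides advanced mechanisms to reshape them.

# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")

Calling tf.Variable() adds several ops to the graph:
• A variable op that holds the variable's value.
• An initializer op that sets the variable to its initial value. This is actually a tf.assign op.
• The ops for the initial value, such as the zeros op for the biases variable in the example, are also added to the graph.

The value returned by tf.Variable() is an instance of the Python class tf.Variable.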
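Initialization

Variable initializers must be run explicitly before any other op in your model can run. The easiest way is to add an op that runs all the variable initializers, and to run that op before using the model.

Alternatively, you can restore variable values from a checkpoint file, as described below.

Use tf.initialize_all_variables() to add an op that runs the variable initializers. Run that op only after you have fully constructed your model and launched it in a session.

# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()

# Later, when launching the model
with tf.Session() as sess:
    # Run the init operation.
    sess.run(init_op)
    ...
    # Use the model
    ...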
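Initialization from another variable

You sometimes need to initialize a variable from the initial value of another variable. Because the op added by tf.initialize_all_variables() initializes all variables in parallel, you have to be careful when this is needed.

To initialize a new variable from the value of another variable, use the other variable's initialized_value() method. You can use the initialized value directly as the initial value for the new variable, or use it like any other tensor to compute a value for the new variable.

# Create a variable with a random value.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")
# Create another variable with the same value as 'weights'.
w2 = tf.Variable(weights.initialized_value(), name="w2")
# Create another variable with twice the value of 'weights'.
w_twice = tf.Variable(weights.initialized_value() * 2.0, name="w_twice")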
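To make the ordering problem concrete, here is a small library-free sketch in plain Python. It is only a conceptual model of the guarantee that initialized_value() gives, not how TensorFlow implements it; the ToyVariable class and its methods are hypothetical names introduced for illustration.

```python
class ToyVariable(object):
    """Hypothetical stand-in for a variable; not part of TensorFlow."""

    def __init__(self, initial_value_fn):
        self._initial_value_fn = initial_value_fn  # computes the initial value
        self._value = None
        self._initialized = False

    def initialize(self):
        # Run the initializer at most once.
        if not self._initialized:
            self._value = self._initial_value_fn()
            self._initialized = True

    def initialized_value(self):
        # Force this variable's initializer to run before its value is read.
        self.initialize()
        return self._value

    def value(self):
        if not self._initialized:
            raise RuntimeError("variable used before initialization")
        return self._value


weights = ToyVariable(lambda: [0.5, -1.0, 2.0])
w_twice = ToyVariable(lambda: [2.0 * v for v in weights.initialized_value()])

# Initialize in the "wrong" order on purpose: w_twice first.
for var in (w_twice, weights):
    var.initialize()
```

Because w_twice reads weights through initialized_value(), the result does not depend on the order in which the initializers happen to run.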
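Custom initialization

The convenience function tf.initialize_all_variables() adds an op that initializes all variables in the model. To initialize only some of them, pass an explicit list of variables to tf.initialize_variables(). See the Variables documentation for more options, including checking whether variables are initialized.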
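Saving and restoring

The easiest way to save and restore a model is to use a tf.train.Saver object. Its constructor adds save and restore ops to the graph for all of the variables in the graph, or for a specified list of them. The saver object provides methods to run these ops, specifying the paths of the checkpoint files to write to or read from.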
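Checkpoint files

Variables are saved in binary files that, roughly, contain a map from variable names to tensor values.

When you create a Saver object, you can optionally choose the names under which variables are stored in the checkpoint files. By default, it uses the value of each variable's Variable.name property.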
Saving variables

Create a Saver with tf.train.Saver() to manage all the variables in the model.

# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
    sess.run(init_op)
    # Do some work with the model.
    ...
    # Save the variables to disk.
    save_path = saver.save(sess, ...)
    print("Model saved in file: %s" % save_path)
Restoring variables

Use the same Saver object to restore variables. Note that when you restore variables from a file, you do not have to initialize them beforehand.

# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, ...)
    print("Model restored.")
    # Do some work with the model
    ...
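Choosing which variables to save and restore

If you do not pass any argument to tf.train.Saver(), the saver handles all variables in the graph. Each of them is saved under the name that was passed when the variable was created.

It is sometimes useful to explicitly specify names for variables in the checkpoint files. For example, you may have trained a model with a variable under one name and want to restore its value into a new variable that has a different name.

It is also sometimes useful to save or restore only a subset of the variables used by a model. For example, you may have trained a neural network with 5 layers and now want to train a new model with 6 layers, loading the parameters of the 5-layer model into the first 5 layers of the new one.

You can easily specify the names and variables to save by passing a Python dictionary to the tf.train.Saver() constructor: keys are the names to use, and values are the variables to manage.

Notes:
• You can create as many saver objects as you want if you need to save and restore different subsets of the model variables. The same variable can be listed in multiple saver objects; its value changes only when a saver's restore() method is run.
• If you restore only a subset of the model variables at the start of a session, you have to run an initialize op for the other variables. See tf.initialize_variables() for more information.

# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to save and restore only 'v2', using a dictionary key as its
# name in the checkpoint file.
saver = tf.train.Saver({...: v2})
# Use the saver object normally after that.
...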
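The idea of a checkpoint as a name-to-value map, restored selectively and under different names, can be sketched without TensorFlow. The snippet below is a toy model only: it uses JSON rather than the real binary checkpoint format, and the function names, the file name toy.ckpt, and the variable names are all hypothetical.

```python
import json
import os
import tempfile


def save_checkpoint(path, names_to_values):
    """Write a name -> value mapping to disk (toy stand-in for a checkpoint)."""
    with open(path, "w") as f:
        json.dump(names_to_values, f)


def restore_checkpoint(path, names_to_targets, variables):
    """Copy saved values into `variables`, looking each one up by its saved name.

    names_to_targets maps a name in the file to a key in `variables`.
    Variables that are not listed are left untouched.
    """
    with open(path) as f:
        saved = json.load(f)
    for saved_name, target in names_to_targets.items():
        variables[target] = saved[saved_name]


# Hypothetical "old" model with two variables.
old_model = {"v1": [1.0, 2.0], "v2": [3.0, 4.0]}
checkpoint_path = os.path.join(tempfile.mkdtemp(), "toy.ckpt")
save_checkpoint(checkpoint_path, old_model)

# A "new" model: restore only the saved "v2", into a differently named variable.
new_model = {"params": None, "extra": [0.0]}
restore_checkpoint(checkpoint_path, {"v2": "params"}, new_model)
```

Here "extra" plays the role of a variable that was not restored and would still need its own initialization.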
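Reading data

There are three ways of getting data into a TensorFlow program:
• Feeding: Python code provides the data when running each step.
• Reading from files: an input pipeline at the beginning of the TensorFlow graph reads the data from files.
• Preloaded data: a constant or variable in the TensorFlow graph holds all the data (for small data sets only).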
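Contents

Reading data
• Feeding
• Reading from files
  o Filenames, shuffling, and epoch limits
  o File formats
  o Preprocessing
  o Batching
  o Creating threads to prefetch using QueueRunner objects
  o Filtering records or producing multiple examples per record
  o Sparse input data
• Preloaded data
• Multiple input pipelines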
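Feeding

TensorFlow's feed mechanism lets you inject data into any tensor in a computation graph. A Python computation can thus feed data directly into the graph.

Supply feed data through the feed_dict argument to a run() or eval() call that initiates computation.

with tf.Session():
    input = tf.placeholder(tf.float32)
    classifier = ...
    print(classifier.eval(feed_dict={input: my_python_preprocessing_fn()}))

While you can replace any tensor with feed data, including variables and constants, the best practice is to use a placeholder op node. A placeholder exists solely to serve as the target of feeds. It is not initialized and contains no data. A placeholder generates an error if it is executed without a feed, so you won't forget to feed it.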
An example that uses placeholders and feeding to train on MNIST data can be found in tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py, and is described in the MNIST tutorial.
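As a rough mental model of feeding, the following plain-Python sketch evaluates a function over placeholder nodes and fails when a placeholder has no fed value. It does not use TensorFlow; Placeholder, evaluate, and scaled_sum are hypothetical names chosen for this illustration.

```python
class Placeholder(object):
    """Hypothetical graph node with no value of its own."""

    def __init__(self, name):
        self.name = name


def evaluate(fn, inputs, feed_dict):
    """Apply `fn` to the values fed for `inputs`; fail if a feed is missing."""
    values = []
    for node in inputs:
        if node not in feed_dict:
            raise ValueError("no value fed for placeholder %r" % node.name)
        values.append(feed_dict[node])
    return fn(*values)


def scaled_sum(xs, s):
    return s * sum(xs)


x = Placeholder("x")
scale = Placeholder("scale")

# Every placeholder is fed, so evaluation succeeds.
result = evaluate(scaled_sum, [x, scale],
                  feed_dict={x: [1.0, 2.0, 3.0], scale: 2.0})

# Forgetting to feed `scale` is reported as an error instead of using stale data.
try:
    evaluate(scaled_sum, [x, scale], feed_dict={x: [1.0]})
    missing_feed_error = None
except ValueError as err:
    missing_feed_error = str(err)
```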
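Reading from files

A typical pipeline for reading records from files has the following stages:
1. The list of filenames
2. Optional filename shuffling
3. Optional epoch limit
4. Filename queue
5. A reader for the file format
6. A decoder for a record read by the reader
7. Optional preprocessing
8. Example queue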
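Filenames, shuffling, and epoch limits

For the list of filenames, use either a constant string tensor (such as a literal list of filenames, or a list comprehension that formats an index i for i in range(2)) or the tf.train.match_filenames_once function.

Pass the list of filenames to the tf.train.string_input_producer function. string_input_producer creates a FIFO queue that holds the filenames until the reader needs them.

string_input_producer has options for shuffling and for setting a maximum number of epochs. A queue runner adds the whole list of filenames to the queue once per epoch, shuffling the filenames within an epoch if shuffle=True. This procedure samples files uniformly, so that examples are not under- or over-sampled relative to each other.

The queue runner works in a thread separate from the reader that pulls filenames from the queue, so shuffling and enqueuing do not block the reader.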
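The per-epoch behaviour described above can be sketched with a plain Python generator. This is an illustration of the idea only, not the tf.train.string_input_producer implementation; the function name filename_producer, its seed parameter, and the filenames are assumptions made for the example.

```python
import random


def filename_producer(filenames, num_epochs=None, shuffle=True, seed=None):
    """Yield every filename once per epoch, optionally reshuffled each epoch.

    With num_epochs=None the generator never stops by itself.
    """
    rng = random.Random(seed)
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        names = list(filenames)
        if shuffle:
            rng.shuffle(names)
        for name in names:
            yield name
        epoch += 1


files = ["a.csv", "b.csv", "c.csv"]  # hypothetical filenames
produced = list(filename_producer(files, num_epochs=2, seed=0))
first_epoch, second_epoch = produced[:3], produced[3:]
```

Each epoch contains every file exactly once, which is what keeps the sampling of files uniform; only the order within an epoch changes.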
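File formats

Select the reader that matches your input file format and pass the filename queue to the reader's read method. The read method outputs a key identifying the file and record (useful for debugging) and a scalar string value. Use one or more decoder and conversion ops to decode this string into the tensors that make up an example.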
CSV files

To read text files in comma-separated value (CSV) format, use a TextLineReader with the decode_csv op. For example:

filename_queue = tf.train.string_input_producer([...])

reader = tf.TextLineReader()
key, value = reader.read(filename_queue)

# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[1], [1], [1], [1], [1]]
col1, col2, col3, col4, col5 = tf.decode_csv(
    value, record_defaults=record_defaults)
# decode_csv returns scalar tensors, so pack them into one feature vector.
features = tf.pack([col1, col2, col3, col4])

with tf.Session() as sess:
    # Start populating the filename queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    for i in range(1200):
        # Retrieve a single instance:
        example, label = sess.run([features, col5])

    coord.request_stop()
    coord.join(threads)

Each execution of read reads a single line from the file. The decode_csv op then parses that line into a list of tensors. The record_defaults argument determines the type of the resulting tensors and sets the default value to use if a value is missing in the input string.

You must call tf.train.start_queue_runners to populate the queue before you call run or eval to execute the read. Otherwise read blocks while it waits for filenames from the queue.
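The role of record_defaults (one default per column, which also fixes the column type) can be illustrated with the standard library alone. This is a simplified sketch, not the decode_csv op; decode_csv_line is a hypothetical helper and the sample line is made up.

```python
import csv


def decode_csv_line(line, record_defaults):
    """Parse one CSV line into typed fields.

    Each entry of record_defaults is a one-element list; its element gives both
    the column type and the value used when the column is empty.
    """
    fields = next(csv.reader([line.rstrip("\n")]))
    if len(fields) != len(record_defaults):
        raise ValueError("expected %d fields, got %d"
                         % (len(record_defaults), len(fields)))
    decoded = []
    for text, (default,) in zip(fields, record_defaults):
        decoded.append(default if text == "" else type(default)(text))
    return decoded


record_defaults = [[1], [1], [1], [1], [1]]
row = decode_csv_line("5,,7,8,0\n", record_defaults)  # second column is empty
features, label = row[:4], row[4]
```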
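Fixed length records

To read binary files in which each record is a fixed number of bytes, use tf.FixedLengthRecordReader with the tf.decode_raw op. The decode_raw op converts a string into a uint8 tensor.

For example, the CIFAR-10 dataset uses a file format in which every record has a fixed number of bytes: 1 byte for the label followed by 3072 bytes of image data. Once you have a uint8 tensor, standard ops can slice out each piece and reformat it as needed. Example code can be found in tensorflow/models/image/cifar10/cifar10_input.py, which is described in the tutorial.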
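The record layout described above (1 label byte followed by 3072 image bytes) can be sliced with plain Python bytes. This sketch only illustrates the fixed-length slicing and uses synthetic data; split_records and the constants are names introduced here, and it does not reproduce cifar10_input.py.

```python
LABEL_BYTES = 1
IMAGE_BYTES = 3072  # as stated above for CIFAR-10
RECORD_BYTES = LABEL_BYTES + IMAGE_BYTES


def split_records(data):
    """Split a bytes object into (label, image_bytes) pairs."""
    if len(data) % RECORD_BYTES != 0:
        raise ValueError("data length is not a multiple of the record size")
    records = []
    for start in range(0, len(data), RECORD_BYTES):
        record = bytearray(data[start:start + RECORD_BYTES])
        label = record[0]                    # first byte: the label
        image = bytes(record[LABEL_BYTES:])  # remaining bytes: the image
        records.append((label, image))
    return records


# Two synthetic records: label 3 with all-zero pixels, label 7 with all-255 pixels.
fake_file = (bytes(bytearray([3])) + b"\x00" * IMAGE_BYTES +
             bytes(bytearray([7])) + b"\xff" * IMAGE_BYTES)
records = split_records(fake_file)
```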
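Standard TensorFlow format

Another approach is to convert whatever data you have into a format that TensorFlow supports. This makes it easier to mix and match data sets and network architectures. The recommended format is a TFRecords file containing tf.train.Example protocol buffers (which contain Features as a field). You write a small program that gets your data, stuffs it into an Example protocol buffer, serializes the protocol buffer to a string, and then writes the string to a TFRecords file using the tf.python_io.TFRecordWriter class. tensorflow/g3doc/how_tos/reading_data/convert_to_records.py is an example of such a program.

To read a file of TFRecords, use tf.TFRecordReader with the tf.parse_single_example decoder. The parse_single_example op decodes the Example protocol buffers into tensors. An MNIST example that uses the data produced by convert_to_records can be found in tensorflow/g3doc/how_tos/reading_data/fully_connected_reader.py, which you can compare with the fully_connected_feed version.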
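Preprocessing

You can then apply any preprocessing you want to these examples, that is, any processing that does not depend on trainable parameters. Examples include normalizing the data, picking a random slice, and adding noise or distortions. See tensorflow/models/image/cifar10/cifar10.py for an example.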
Batching

At the end of the pipeline we use another queue to batch together examples for training, evaluation, or inference. For this we use a queue that randomizes the order of examples, created with the tf.train.shuffle_batch function.

Example:

def read_my_file_format(filename_queue):
    reader = tf.SomeReader()
    key, record_string = reader.read(filename_queue)
    example, label = tf.some_decoder(record_string)
    processed_example = some_processing(example)
    return processed_example, label

def input_pipeline(filenames, batch_size, num_epochs=None):
    filename_queue = tf.train.string_input_producer(
        filenames, num_epochs=num_epochs, shuffle=True)
    example, label = read_my_file_format(filename_queue)
    # min_after_dequeue defines how big a buffer we will randomly sample
    #   from -- bigger means better shuffling but slower start up and more
    #   memory used.
    # capacity must be larger than min_after_dequeue and the amount larger
    #   determines the maximum we will prefetch.  Recommendation:
    #   min_after_dequeue + (num_threads + a small safety margin) * batch_size
    min_after_dequeue = 10000
    capacity = min_after_dequeue + 3 * batch_size
    example_batch, label_batch = tf.train.shuffle_batch(
        [example, label], batch_size=batch_size, capacity=capacity,
        min_after_dequeue=min_after_dequeue)
    return example_batch, label_batch

If you need more parallelism or more shuffling of examples between files, use multiple reader instances with the tf.train.shuffle_batch_join function. For example:

def read_my_file_format(filename_queue):
    # Same as above
    ...

def input_pipeline(filenames, batch_size, read_threads, num_epochs=None):
    filename_queue = tf.train.string_input_producer(
        filenames, num_epochs=num_epochs, shuffle=True)
    example_list = [read_my_file_format(filename_queue)
                    for _ in range(read_threads)]
    min_after_dequeue = 10000
    capacity = min_after_dequeue + 3 * batch_size
    example_batch, label_batch = tf.train.shuffle_batch_join(
        example_list, batch_size=batch_size, capacity=capacity,
        min_after_dequeue=min_after_dequeue)
    return example_batch, label_batch
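In this example you still use only a single filename queue, shared by all the readers. That way, different readers read different files from the same epoch until all the files from that epoch have been started. (A single thread filling the filename queue is usually sufficient.)

An alternative is to use a single reader via tf.train.shuffle_batch with num_threads greater than 1. All threads then read from a single file at the same time (still faster than with one thread), instead of reading several files at once. The advantages of this approach are:
• It avoids having two different threads read the same example from the same file.
• It avoids excessive disk seeks.

How many reading threads do you need? The tf.train.shuffle_batch* functions add a summary to the graph that indicates how full the example queue is. If you have enough reading threads, that summary stays above zero. You can view your summaries as training progresses using TensorBoard; see TensorBoard: Visualizing Learning.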
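To see how min_after_dequeue affects shuffling, here is a single-threaded sketch of a shuffling buffer in plain Python. It is a simplified model under stated assumptions, not tf.train.shuffle_batch: there are no threads and no capacity limit, and the name shuffle_batches and its seed parameter are made up for the example.

```python
import random


def shuffle_batches(examples, batch_size, min_after_dequeue, seed=None):
    """Yield shuffled batches from an iterable using a random buffer.

    While input remains, an element is dequeued only when more than
    min_after_dequeue elements are buffered; once the input is exhausted
    (the queue is "closed"), the buffer is drained completely.
    """
    rng = random.Random(seed)
    buffer, batch = [], []

    def dequeue_one():
        # Move a uniformly chosen buffered element into the current batch.
        index = rng.randrange(len(buffer))
        buffer[index], buffer[-1] = buffer[-1], buffer[index]
        batch.append(buffer.pop())

    for example in examples:
        buffer.append(example)
        if len(buffer) > min_after_dequeue:
            dequeue_one()
            if len(batch) == batch_size:
                yield list(batch)
                del batch[:]
    while buffer:
        dequeue_one()
        if len(batch) == batch_size:
            yield list(batch)
            del batch[:]
    if batch:
        yield list(batch)  # final partial batch


batches = list(shuffle_batches(range(10), batch_size=4,
                               min_after_dequeue=5, seed=0))
```

A larger min_after_dequeue means each draw picks from more candidates (better mixing) at the cost of a longer warm-up and more memory, which matches the trade-off noted in the code comments above.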
Creating threads to prefetch using QueueRunner objects

The short version: many of the tf.train functions listed above add QueueRunner objects to your graph. You must call tf.train.start_queue_runners before running any training step, or the graph will hang forever. tf.train.start_queue_runners starts the threads that run the input pipeline and fill the example queue, so that the dequeue op can obtain examples. This is best combined with a tf.train.Coordinator, which shuts these threads down cleanly when an error occurs. If you set a limit on the number of epochs, an epoch counter is used, and it needs to be initialized. The recommended code pattern is:

# Create the graph, etc.
init_op = tf.initialize_all_variables()

# Create a session for running operations in the Graph.
sess = tf.Session()

# Initialize the variables (like the epoch counter).
sess.run(init_op)

# Start input enqueue threads.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

try:
    while not coord.should_stop():
        # Run training steps or whatever
        sess.run(train_op)

except tf.errors.OutOfRangeError:
    print('Done training -- epoch limit reached')
finally:
    # When done, ask the threads to stop.
    coord.request_stop()

# Wait for threads to finish.
coord.join(threads)
sess.close()
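Aside: What is happening here?

First we create the graph. It has a few pipeline stages that are connected by queues. The first stage generates the filenames to read and enqueues them in the filename queue. The second stage consumes filenames (using a Reader), produces examples, and enqueues them in an example queue. Depending on how you have set things up, you may actually have several independent copies of the second stage, so that you can read from multiple files in parallel. At the end of each stage is an enqueue operation, which puts elements into a queue that the next stage dequeues from. We want to start threads running these enqueue operations, so that our training loop can keep dequeuing examples from the example queue.

The helpers in tf.train that create these queues and enqueue operations add a tf.train.QueueRunner to the graph using the tf.train.add_queue_runner function. Each QueueRunner is responsible for one stage and holds the list of enqueue operations that need to be run in threads. Once the graph is constructed, the tf.train.start_queue_runners function asks each QueueRunner in the graph to start its threads running the enqueue operations.

If all goes well, you can now run your training steps while the queues are filled by the background threads. If you have set an epoch limit, at some point an attempt to dequeue examples will raise a tf.errors.OutOfRangeError. This is the TensorFlow equivalent of "end of file" (EOF): it means the epoch limit has been reached and no more examples are available.

The last ingredient is the Coordinator. It is responsible for letting all the threads know when anything has signaled a shutdown. Most commonly this happens because an exception was raised, for example when one of the threads got an error while running some operation (or an ordinary Python exception). For more about threading, queues, QueueRunners, and Coordinators, see the documentation on threading and queues.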
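The interplay of a background enqueue thread, a bounded queue, and a coordinator can be sketched with the Python standard library. This is an analogy under stated assumptions, not TensorFlow's implementation: ToyCoordinator, put_or_stop, enqueue_examples, and the END_OF_INPUT sentinel (standing in for a closed queue and its out-of-range error) are hypothetical.

```python
import queue
import threading


class ToyCoordinator(object):
    """Hypothetical, minimal stand-in for a coordinator."""

    def __init__(self):
        self._stop_event = threading.Event()

    def should_stop(self):
        return self._stop_event.is_set()

    def request_stop(self):
        self._stop_event.set()

    def join(self, threads):
        for thread in threads:
            thread.join()


END_OF_INPUT = object()  # sentinel playing the role of "queue closed"


def put_or_stop(coord, example_queue, item):
    """Block until `item` is enqueued; give up if a stop is requested."""
    while not coord.should_stop():
        try:
            example_queue.put(item, timeout=0.05)
            return True
        except queue.Full:
            pass
    return False


def enqueue_examples(coord, example_queue, examples):
    """Producer thread: fill the queue, then signal that no more data is coming."""
    for example in examples:
        if not put_or_stop(coord, example_queue, example):
            return
    put_or_stop(coord, example_queue, END_OF_INPUT)


coord = ToyCoordinator()
example_queue = queue.Queue(maxsize=4)  # bounded: the producer prefetches only a little
producer = threading.Thread(target=enqueue_examples,
                            args=(coord, example_queue, range(20)))
producer.start()

consumed = []
try:
    while not coord.should_stop():
        item = example_queue.get()
        if item is END_OF_INPUT:
            break  # plays the role of catching the out-of-range error
        consumed.append(item)  # a "training step"
finally:
    coord.request_stop()  # when done, ask the threads to stop
coord.join([producer])  # wait for threads to finish
```

The structure mirrors the recommended pattern above: start the enqueue thread first, loop until the input ends or a stop is requested, then request a stop and join.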
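Aside: How do threads get cleaned up when the epoch limit is reached?

Imagine you have a model that sets a limit on the number of epochs to train on. That means the thread generating filenames runs only that many times before generating an OutOfRange error. The QueueRunner catches that error, closes the filename queue, and exits the thread. Closing the queue does two things:
• Any future attempt to enqueue in the filename queue generates an error. At this point no thread should be trying to do that, but this is helpful when queues are closed because of other errors.
• Any current or future dequeue either succeeds (if enough elements are left) or fails immediately (with an OutOfRange error). It does not block waiting for more elements to be enqueued, because the previous point guarantees that this cannot happen.

The point is that when the filename queue is closed, there will likely still be many filenames in it, so the next stage of the pipeline (the reader and other preprocessing) may continue running for some time. Once the filename queue is exhausted, though, the next attempt to dequeue a filename (for example, from a reader that has finished with the file it was working on) triggers an OutOfRange error. In this case you might have multiple threads associated with a single QueueRunner. If this is not the last thread in the QueueRunner, the OutOfRange error just causes that one thread to exit. This lets the other threads, which are still finishing their last file, proceed until they finish as well. (Assuming you are using a tf.train.Coordinator, other types of errors cause all the threads to stop.) Only once all the reader threads have hit the OutOfRange error is the next queue, the example queue, closed.

Again, the example queue will still hold some elements, so training continues until those are exhausted. If the example queue is a RandomShuffleQueue, say because you are using shuffle_batch or shuffle_batch_join, it normally avoids ever holding fewer than min_after_dequeue elements. However, once the queue is closed that restriction is lifted and the queue eventually empties. At that point the actual training threads, when they try to dequeue from the example queue, start getting OutOfRange errors and exit. Once all the training threads are done, tf.train.Coordinator.join returns and you can exit cleanly.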
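Filtering records or producing multiple examples per record

Instead of examples with shape [x, y, z], you produce a batch of examples with shape [batch, x, y, z]. The batch size can be 0 if you want to filter the record out (for instance, because it belongs to a hold-out set), or larger than 1 if you are producing multiple examples per record. Then simply set enqueue_many=True when calling one of the batching functions (such as shuffle_batch or shuffle_batch_join).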
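The idea that each record yields a batch of zero, one, or several examples, which are then enqueued individually, can be sketched in plain Python. The record layout, the hold-out rule, and the names expand_record and enqueue_many below are hypothetical and only mirror the concept behind enqueue_many=True.

```python
def expand_record(record):
    """Return zero or more examples for one record (hypothetical rule).

    Records flagged as hold-out yield no examples; all others yield the
    original values plus a mirrored copy.
    """
    if record["holdout"]:
        return []
    values = record["values"]
    return [values, values[::-1]]


def enqueue_many(records):
    """Flatten the per-record batches into a single stream of examples."""
    examples = []
    for record in records:
        examples.extend(expand_record(record))
    return examples


records = [
    {"values": [1, 2, 3], "holdout": False},
    {"values": [4, 5, 6], "holdout": True},
    {"values": [7, 8, 9], "holdout": False},
]
examples = enqueue_many(records)
```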
Sparse input data

SparseTensors do not work well with queues. If you use SparseTensors, you have to decode the string records with tf.parse_example after batching (instead of using tf.parse_single_example before batching).
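Preloaded data

This approach is only for small data sets that can be loaded entirely into memory. There are two ways to do it:
• Store the data in a constant.
• Store the data in a variable that you initialize once and then never change.

Using a constant is a bit simpler, but uses more memory (because the constant is stored inline in the graph data structure, which may be duplicated a few times).

training_data = ...
training_labels = ...
with tf.Session():
    input_data = tf.constant(training_data)
    input_labels = tf.constant(training_labels)
    ...

To use a variable instead, you also need to initialize it after the graph has been built.

training_data = ...
training_labels = ...
with tf.Session() as sess:
    data_initializer = tf.placeholder(dtype=training_data.dtype,
                                      shape=training_data.shape)
    label_initializer = tf.placeholder(dtype=training_labels.dtype,
                                       shape=training_labels.shape)
    input_data = tf.Variable(data_initializer, trainable=False, collections=[])
    input_labels = tf.Variable(label_initializer, trainable=False, collections=[])
    ...
    sess.run(input_data.initializer,
             feed_dict={data_initializer: training_data})
    sess.run(input_labels.initializer,
             feed_dict={label_initializer: training_labels})

Setting trainable=False keeps the variable out of the GraphKeys.TRAINABLE_VARIABLES collection in the graph, so we won't try to update it when training. Setting collections=[] keeps the variable out of the GraphKeys.VARIABLES collection, which is used for saving and restoring checkpoints.

Either way, tf.train.slice_input_producer can be used to produce one slice at a time. It shuffles the examples across an entire epoch, so further shuffling when batching is unnecessary. Instead of the shuffle_batch functions, we therefore use the plain tf.train.batch function. To use multiple preprocessing threads, set the num_threads parameter to a number greater than 1.

An MNIST example that preloads the data using constants can be found in tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded.py, and one that preloads the data using variables can be found in tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded_var.py. You can compare these with the fully_connected_feed and fully_connected_reader versions described above.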
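Multiple input pipelines

Commonly you will want to train on one dataset and evaluate (or "eval") on another. One way to do this is to actually have two separate processes:
• The training process reads training input data and periodically writes checkpoint files with all the trained variables.
• The evaluation process restores the checkpoint files into an inference model that reads validation input data.

This is what is done in the example CIFAR-10 model. It has a couple of benefits:
• The eval is performed on a single snapshot of the trained variables.
• You can perform the eval even after training has completed and exited.

You can also have the train and eval in the same graph in the same process, and share their trained variables. See the shared variables tutorial.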
Tensor Transformations
Note: Functions taking Tensor arguments can also take anything accepted by tf.convert_to_tensor.

Contents

Tensor Transformations
• Casting
  o tf.string_to_number(string_tensor, out_type=None, name=None)
  o tf.to_double(x, name='ToDouble')
  o tf.to_float(x, name='ToFloat')
  o tf.to_bfloat16(x, name='ToBFloat16')
  o tf.to_int32(x, name='ToInt32')
  o tf.to_int64(x, name='ToInt64')
  o tf.cast(x, dtype, name=None)
• Shapes and Shaping
  o tf.shape(input, name=None)
  o tf.expand_dims(input, dim, name=None)
• Slicing and Joining
  o tf.slice(input_, begin, size, name=None)
  o tf.dynamic_stitch(indices, data, name=None)

Casting
TensorFlow provides several operations that you can use to cast tensor data types in your graph.
tf.string_to_number(string_tensor, out_type=None, name=None)

Converts each string in the input Tensor to the specified numeric type.

(Note that int32 overflow results in an error while float overflow results in a rounded value.)

Args:
• string_tensor: A Tensor of type string.
• out_type: An optional tf.DType from: tf.float32, tf.int32. Defaults to tf.float32. The numeric type to interpret each string in string_tensor as.
• name: A name for the operation (optional).

Returns:
A Tensor of type out_type. A Tensor of the same shape as the input string_tensor.

tf.to_double(x, name='ToDouble')

Casts a tensor to type float64.

Args:
• x: A Tensor or SparseTensor.
• name: A name for the operation (optional).

Returns:
A Tensor or SparseTensor with same shape as x with type float64.

Raises:
• TypeError: If x cannot be cast to the float64.

tf.to_float(x, name='ToFloat')

Casts a tensor to type float32.

Args:
• x: A Tensor or SparseTensor.
• name: A name for the operation (optional).

Returns:
A Tensor or SparseTensor with same shape as x with type float32.

Raises:
• TypeError: If x cannot be cast to the float32.

tf.to_bfloat16(x, name='ToBFloat16')

Casts a tensor to type bfloat16.

Args:
• x: A Tensor or SparseTensor.
• name: A name for the operation (optional).

Returns:
A Tensor or SparseTensor with same shape as x with type bfloat16.

Raises:
• TypeError: If x cannot be cast to the bfloat16.

tf.to_int32(x, name='ToInt32')

Casts a tensor to type int32.

Args:
• x: A Tensor or SparseTensor.
• name: A name for the operation (optional).

Returns:
A Tensor or SparseTensor with same shape as x with type int32.

Raises:
• TypeError: If x cannot be cast to the int32.

tf.to_int64(x, name='ToInt64')

Casts a tensor to type int64.

Args:
• x: A Tensor or SparseTensor.
• name: A name for the operation (optional).

Returns:
A Tensor or SparseTensor with same shape as x with type int64.

Raises:
• TypeError: If x cannot be cast to the int64.
tf.cast(x, dtype, name=None)

Casts a tensor to a new type.

The operation casts x (in case of Tensor) or x.values (in case of SparseTensor) to dtype.

For example:

# tensor `a` is [1.8, 2.2], dtype=tf.float32
tf.cast(a, tf.int32) ==> [1, 2]  # dtype=tf.int32

Args:
• x: A Tensor or SparseTensor.
• dtype: The destination type.
• name: A name for the operation (optional).

Returns:
A Tensor or SparseTensor with same shape as x.

Raises:
• TypeError: If x cannot be cast to the dtype.

Shapes and Shaping
TensorFlow provides several operations that you can use to determine the shape of a tensor and change the shape of a tensor.
tf.shape(input, name=None)

Returns the shape of a tensor.

This operation returns a 1-D integer tensor representing the shape of input.

For example:

# 't' is [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
shape(t) ==> [2, 2, 3]

Args:
• input: A Tensor.
• name: A name for the operation (optional).

Returns:
A Tensor of type int32.

tf.size(input, name=None)

Returns the size of a tensor.

This operation returns an integer representing the number of elements in input.

For example:

# 't' is [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
size(t) ==> 12

Args:
• input: A Tensor.
• name: A name for the operation (optional).

Returns:
A Tensor of type int32.

tf.rank(input, name=None)

Returns the rank of a tensor.

This operation returns an integer representing the rank of input.

For example:

# 't' is [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
# shape of tensor 't' is [2, 2, 3]
rank(t) ==> 3

Note: The rank of a tensor is not the same as the rank of a matrix. The rank of a tensor is the number of indices required to uniquely select each element of the tensor. Rank is also known as "order", "degree", or "ndims".

Args:
• input: A Tensor.
• name: A name for the operation (optional).

Returns:
A Tensor of type int32.
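The relationship between shape, size, and rank can be checked with a few lines of plain Python on nested lists. These helpers are illustrative stand-ins that assume rectangular nested lists; they are not the TensorFlow ops, which return tensors.

```python
def shape(t):
    """Shape of a rectangular nested list; a scalar has shape []."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0] if t else None
    return dims


def size(t):
    """Total number of elements: the product of all dimensions."""
    total = 1
    for dim in shape(t):
        total *= dim
    return total


def rank(t):
    """Number of indices needed to select one element."""
    return len(shape(t))


t = [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
t_shape, t_size, t_rank = shape(t), size(t), rank(t)
```

For the tensor used in the examples above, the size is the product of the shape entries (2 * 2 * 3) and the rank is the length of the shape.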
tf.reshape(tensor, shape, name=None)

Reshapes a tensor.

Given tensor, this operation returns a tensor that has the same values as tensor with shape shape. If shape is the special value [-1], then tensor is flattened and the operation outputs a 1-D tensor with all elements of tensor.

If shape is 1-D or higher, then the operation returns a tensor with shape shape filled with the values of tensor. In this case, the number of elements implied by shape must be the same as the number of elements in tensor.

For example:

# tensor 't' is [1, 2, 3, 4, 5, 6, 7, 8, 9]
# tensor 't' has shape [9]
reshape(t, [3, 3]) ==> [[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]]

# tensor 't' is [[[1, 1], [2, 2]],
#                [[3, 3], [4, 4]]]
# tensor 't' has shape [2, 2, 2]
reshape(t, [2, 4]) ==> [[1, 1, 2, 2],
                        [3, 3, 4, 4]]

# tensor 't' is [[[1, 1, 1],
#                 [2, 2, 2]],
#                [[3, 3, 3],
#                 [4, 4, 4]],
#                [[5, 5, 5],
#                 [6, 6, 6]]]
# tensor 't' has shape [3, 2, 3]
# pass '[-1]' to flatten 't'
reshape(t, [-1]) ==> [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6]

Args:
• tensor: A Tensor.
• shape: A Tensor of type int32. Defines the shape of the output tensor.
• name: A name for the operation (optional).

Returns:
A Tensor. Has the same type as tensor.
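The reshape rule (same values in row-major order, element counts must match, [-1] flattens) can be reproduced on nested Python lists. This is a sketch of the semantics described above, limited to target shapes that are 1-D or higher; flatten and reshape here are plain-Python helpers, not the TensorFlow op.

```python
def flatten(t):
    """Return the elements of a nested list in row-major order."""
    if not isinstance(t, list):
        return [t]
    flat = []
    for item in t:
        flat.extend(flatten(item))
    return flat


def reshape(t, new_shape):
    """Reshape a nested list; the shape [-1] flattens it."""
    flat = flatten(t)
    if list(new_shape) == [-1]:
        return flat
    expected = 1
    for dim in new_shape:
        expected *= dim
    if expected != len(flat):
        raise ValueError("cannot reshape %d elements into shape %r"
                         % (len(flat), list(new_shape)))

    def build(values, dims):
        # Split `values` into dims[0] equal chunks and recurse on each chunk.
        if len(dims) == 1:
            return list(values)
        step = len(values) // dims[0] if dims[0] else 0
        return [build(values[i * step:(i + 1) * step], dims[1:])
                for i in range(dims[0])]

    return build(flat, list(new_shape))


square = reshape([1, 2, 3, 4, 5, 6, 7, 8, 9], [3, 3])
wide = reshape([[[1, 1], [2, 2]], [[3, 3], [4, 4]]], [2, 4])
```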
tf.squeeze(input, squeeze_dims=None, name=None)
Removes dimensions of size 1 from the shape of a tensor.
Given a tensor input, this operation returns a tensor of the same type with all dimensions of size 1 removed. If you don't want to remove all size 1 dimensions, you can remove specific size 1 dimensions by specifying squeeze_dims. For example:
# 't' is a tensor of shape [1, 2, 1, 3, 1, 1]
shape(squeeze(t)) ==> [2, 3]
Or, to remove specific size 1 dimensions:
# 't' is a tensor of shape [1, 2, 1, 3, 1, 1]
shape(squeeze(t, [2, 4])) ==> [1, 2, 3, 1]
Args:
- input: A Tensor. The input to squeeze.
- squeeze_dims: An optional list of ints. Defaults to []. If specified, only squeezes the dimensions listed. The dimension index starts at 0. It is an error to squeeze a dimension that is not 1.
- name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as input. Contains the same data as input, but has one or more dimensions of size 1 removed.
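Because squeeze only changes the shape, its effect can be sketched as a function on shape lists. The following plain-Python sketch is an illustration under that assumption; the name squeezed_shape is hypothetical and is not part of TensorFlow.

```python
def squeezed_shape(shape, squeeze_dims=None):
    """Shape after removing size-1 dimensions (all of them, or only those listed)."""
    if not squeeze_dims:
        # No dimensions listed: drop every dimension of size 1.
        return [d for d in shape if d != 1]
    for axis in squeeze_dims:
        if shape[axis] != 1:
            raise ValueError("cannot squeeze dimension %d of size %d"
                             % (axis, shape[axis]))
    return [d for axis, d in enumerate(shape) if axis not in squeeze_dims]


print(squeezed_shape([1, 2, 1, 3, 1, 1]))          # [2, 3]
print(squeezed_shape([1, 2, 1, 3, 1, 1], [2, 4]))  # [1, 2, 3, 1]
```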
tf.expand_dims(input, dim, name=None)
Inserts a dimension of 1 into a tensor's shape.
Given a tensor input, this operation inserts a dimension of 1 at the dimension index dim of input's shape. The dimension index dim starts at zero; if you specify a negative number for dim it is counted backward from the end.
This operation is useful if you want to add a batch dimension to a single element. For example, if you have a single image of shape [height, width, channels], you can make it a batch of 1 image with expand_dims(image, 0), which will make the shape [1, height, width, channels]. Other examples:
# 't' is a tensor of shape [2]
shape(expand_dims(t, 0)) ==> [1, 2]
shape(expand_dims(t, 1)) ==> [2, 1]
shape(expand_dims(t, -1)) ==> [2, 1]
# 't2' is a tensor of shape [2, 3, 5]
shape(expand_dims(t2, 0)) ==> [1, 2, 3, 5]
shape(expand_dims(t2, 2)) ==> [2, 3, 1, 5]
shape(expand_dims(t2, 3)) ==> [2, 3, 5, 1]
This operation requires that:
-1-input.dims() <= dim <= input.dims()
This operation is related to squeeze(), which removes dimensions of size 1.
Args:
- input: A Tensor.
- dim: A Tensor of type int32. 0-D (scalar). Specifies the dimension index at which to expand the shape of input.
- name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as input. Contains the same data as input, but its shape has an additional dimension of size 1 added.
Slicing and Joining
TensorFlow provides several operations to slice or extract parts of a tensor, or join multiple tensors together.
tf.slice(input_, begin, size, name=None)
Extracts a slice from a tensor.
This operation extracts a slice of size size from a tensor input starting at the location specified by begin. The slice size is represented as a tensor shape, where size[i] is the number of elements of the 'i'th dimension of input that you want to slice. The starting location (begin) for the slice is represented as an offset in each dimension of input. In other words, begin[i] is the offset into the 'i'th dimension of input that you want to slice from.
begin is zero-based; size is one-based. If size[i] is -1, all remaining elements in dimension i are included in the slice. In other words, this is equivalent to setting:
size[i] = input.dim_size(i) - begin[i]
This operation requires that:
0 <= begin[i] <= begin[i] + size[i] <= Di for i in [0, n]
For example:
# 'input' is [[[1, 1, 1], [2, 2, 2]],
#             [[3, 3, 3], [4, 4, 4]],
#             [[5, 5, 5], [6, 6, 6]]]
tf.slice(input, [1, 0, 0], [1, 1, 3]) ==> [[[3, 3, 3]]]
tf.slice(input, [1, 0, 0], [1, 2, 3]) ==> [[[3, 3, 3],
                                            [4, 4, 4]]]
tf.slice(input, [1, 0, 0], [2, 1, 3]) ==> [[[3, 3, 3]],
                                           [[5, 5, 5]]]
Args:
- input_: A Tensor.
- begin: An int32 or int64 Tensor.
- size: An int32 or int64 Tensor.
- name: A name for the operation (optional).
Returns:
A Tensor the same type as input.
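The begin/size convention, including the size[i] = -1 shorthand, can be sketched on nested lists in plain Python. This is an illustration of the indexing rule only, not the TensorFlow kernel; slice_nested is a hypothetical name.

```python
def slice_nested(value, begin, size):
    """Extract a slice from a nested list; size[i] == -1 means 'to the end'."""
    if not begin:
        return value
    start = begin[0]
    # -1 expands to "all remaining elements in this dimension".
    count = len(value) - start if size[0] == -1 else size[0]
    if not 0 <= start <= start + count <= len(value):
        raise ValueError("slice is out of bounds")
    return [slice_nested(item, begin[1:], size[1:])
            for item in value[start:start + count]]


data = [[[1, 1, 1], [2, 2, 2]],
        [[3, 3, 3], [4, 4, 4]],
        [[5, 5, 5], [6, 6, 6]]]
print(slice_nested(data, [1, 0, 0], [1, 1, 3]))    # [[[3, 3, 3]]]
print(slice_nested(data, [1, 0, 0], [-1, 1, -1]))  # [[[3, 3, 3]], [[5, 5, 5]]]
```

The second call uses -1 in dimensions 0 and 2, so it is equivalent to passing size [2, 1, 3].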
tf.split(split_dim, num_split, value, name='split')
Splits a tensor into num_split tensors along one dimension.
Splits value along dimension split_dim into num_split smaller tensors. Requires that num_split evenly divide value.shape[split_dim]. For example:
# 'value' is a tensor with shape [5, 30]
# Split 'value' into 3 tensors along dimension 1
split0, split1, split2 = tf.split(1, 3, value)
tf.shape(split0) ==> [5, 10]
Args:
- split_dim: A 0-D int32 Tensor. The dimension along which to split. Must be in the range [0, rank(value)).
- num_split: A 0-D int32 Tensor. The number of ways to split.
- value: The Tensor to split.
- name: A name for the operation (optional).
Returns:
num_split Tensor objects resulting from splitting value.
tf.tile(input, multiples, name=None)
Constructs a tensor by tiling a given tensor.
This operation creates a new tensor by replicating input multiples times. The output tensor's i'th dimension has input.dims(i) * multiples[i] elements, and the values of input are replicated multiples[i] times along the 'i'th dimension. For example, tiling [a b c d] by [2] produces [a b c d a b c d].
Args:
- input: A Tensor. 1-D or higher.
- multiples: A Tensor of type int32. 1-D. Length must be the same as the number of dimensions in input.
- name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as input.
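The replication rule can be sketched on nested lists in plain Python. This is only an illustration of the semantics; tile_nested is a hypothetical helper, not a TensorFlow function.

```python
def tile_nested(value, multiples):
    """Repeat a nested list multiples[i] times along dimension i."""
    if not multiples:
        return value
    # Tile the inner dimensions first, then repeat the result along this one.
    return [tile_nested(item, multiples[1:])
            for _ in range(multiples[0])
            for item in value]


print(tile_nested(['a', 'b', 'c', 'd'], [2]))
# ['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd']
print(tile_nested([[1, 2], [3, 4]], [2, 3]))
# [[1, 2, 1, 2, 1, 2], [3, 4, 3, 4, 3, 4], [1, 2, 1, 2, 1, 2], [3, 4, 3, 4, 3, 4]]
```

In the second call the [2, 2] input becomes a [4, 6] output, matching input.dims(i) * multiples[i] in each dimension.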
tf.pad(input, paddings, name=None)
Pads a tensor with zeros.
This operation pads input with zeros according to the paddings you specify. paddings is an integer tensor with shape [n, 2], where n is the rank of input. For each dimension D of input, paddings[D, 0] indicates how many zeros to add before the contents of input in that dimension, and paddings[D, 1] indicates how many zeros to add after the contents of input in that dimension.
The padded size of each dimension D of the output is:
paddings(D, 0) + input.dim_size(D) + paddings(D, 1)
For example:
# 't' is [[1, 1], [2, 2]]
# 'paddings' is [[1, 1], [2, 2]]
# rank of 't' is 2
pad(t, paddings) ==> [[0, 0, 0, 0, 0, 0]
                      [0, 0, 1, 1, 0, 0]
                      [0, 0, 2, 2, 0, 0]
                      [0, 0, 0, 0, 0, 0]]
Args:
- input: A Tensor.
- paddings: A Tensor of type int32.
- name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as input.
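The padded-size formula can be checked with a small plain-Python sketch for the 2-D case. This is an illustration only, assuming a non-empty rectangular list of rows; pad_2d is a hypothetical name. By the formula, a [2, 2] input with paddings [[1, 1], [2, 2]] yields a [4, 6] output.

```python
def pad_2d(matrix, paddings):
    """Zero-pad a 2-D list.

    paddings is [[rows_before, rows_after], [cols_before, cols_after]].
    """
    (top, bottom), (left, right) = paddings
    width = left + len(matrix[0]) + right
    padded = [[0] * width for _ in range(top)]
    for row in matrix:
        padded.append([0] * left + list(row) + [0] * right)
    padded.extend([0] * width for _ in range(bottom))
    return padded


for row in pad_2d([[1, 1], [2, 2]], [[1, 1], [2, 2]]):
    print(row)
# [0, 0, 0, 0, 0, 0]
# [0, 0, 1, 1, 0, 0]
# [0, 0, 2, 2, 0, 0]
# [0, 0, 0, 0, 0, 0]
```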
tf.concat(concat_dim, values, name='concat')
Concatenates tensors along one dimension.
Concatenates the list of tensors values along dimension concat_dim. If values[i].shape = [D0, D1, ... D_concat_dim(i), ... Dn], the concatenated result has shape [D0, D1, ... R_concat_dim, ... Dn] where
R_concat_dim = sum(D_concat_dim(i))
That is, the data from the input tensors is joined along the concat_dim dimension. The number of dimensions of the input tensors must match, and all dimensions except concat_dim must be equal. For example:
t1 = [[1, 2, 3], [4, 5, 6]]
t2 = [[7, 8, 9], [10, 11, 12]]
tf.concat(0, [t1, t2]) ==> [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
tf.concat(1, [t1, t2]) ==> [[1, 2, 3, 7, 8, 9], [4, 5, 6, 10, 11, 12]]
# tensor t3 with shape [2, 3]
# tensor t4 with shape [2, 3]
tf.shape(tf.concat(0, [t3, t4])) ==> [4, 3]
tf.shape(tf.concat(1, [t3, t4])) ==> [2, 6]
Args:
- concat_dim: 0-D int32 Tensor. Dimension along which to concatenate.
- values: A list of Tensor objects or a single Tensor.
- name: A name for the operation (optional).
Returns:
A Tensor resulting from concatenation of the input tensors.
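For 2-D inputs, the two concatenation directions can be sketched in plain Python. This is an illustration of the shape rule only; concat_2d is a hypothetical helper that keeps the concat_dim-first argument order used in this section.

```python
def concat_2d(concat_dim, matrices):
    """Join 2-D lists along rows (concat_dim=0) or columns (concat_dim=1)."""
    if concat_dim == 0:
        # All inputs must agree on the number of columns.
        if len(set(len(row) for m in matrices for row in m)) > 1:
            raise ValueError("column counts differ")
        return [list(row) for m in matrices for row in m]
    if concat_dim == 1:
        # All inputs must agree on the number of rows.
        if len(set(len(m) for m in matrices)) > 1:
            raise ValueError("row counts differ")
        return [[x for m in matrices for x in m[r]]
                for r in range(len(matrices[0]))]
    raise ValueError("concat_dim must be 0 or 1 for 2-D inputs")


t1 = [[1, 2, 3], [4, 5, 6]]
t2 = [[7, 8, 9], [10, 11, 12]]
print(concat_2d(0, [t1, t2]))  # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(concat_2d(1, [t1, t2]))  # [[1, 2, 3, 7, 8, 9], [4, 5, 6, 10, 11, 12]]
```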
tf.pack(values, name='pack')
Packs a list of rank-R tensors into one rank-(R+1) tensor.
Packs tensors in values into a tensor with rank one higher than each tensor in values and shape [len(values)] + values[0].shape. The output satisfies output[i, ...] = values[i][...].
This is the opposite of unpack. The numpy equivalent is
tf.pack([x, y, z]) = np.asarray([x, y, z])
Args:
- values: A list of Tensor objects with the same shape and type.
- name: A name for this operation (optional).
Returns:
- output: A packed Tensor with the same type as values.
tf.unpack(value, num=None, name='unpack')
Unpacks the outer dimension of a rank-R tensor into rank-(R-1) tensors.
Unpacks num tensors from value along the first dimension. If num is not specified (the default), it is inferred from value's shape. If value.shape[0] is not known, ValueError is raised.
The ith tensor in output is the slice value[i, ...]. Each tensor in output has shape value.shape[1:].
This is the opposite of pack. The numpy equivalent is
tf.unpack(x, n) = list(x)
Args:
- value: A rank R > 0 Tensor to be unpacked.
- num: An int. The first dimension of value. Automatically inferred if None (the default).
- name: A name for the operation (optional).
Returns:
The list of Tensor objects unpacked from value.
Raises:
- ValueError: If num is unspecified and cannot be inferred.
tf.reverse_sequence(input, seq_lengths, seq_dim, name=None)
Reverses variable length slices in dimension seq_dim.
This op first slices input along the first dimension, and for each slice i, reverses the first seq_lengths[i] elements along the dimension seq_dim.
The elements of seq_lengths must obey seq_lengths[i] < input.dims(seq_dim), and seq_lengths must be a vector of length input.dims(0).
The output slice i along dimension 0 is then given by input slice i, with the first seq_lengths[i] slices along dimension seq_dim reversed. For example:
# Given this:
seq_dim = 1
input.dims = (4, ...)
seq_lengths = [7, 2, 3, 5]
# then slices of input are reversed on seq_dim, but only up to seq_lengths:
output[0, 0:7, :, ...] = input[0, 7:0:-1, :, ...]
output[1, 0:2, :, ...] = input[1, 2:0:-1, :, ...]
output[2, 0:3, :, ...] = input[2, 3:0:-1, :, ...]
output[3, 0:5, :, ...] = input[3, 5:0:-1, :, ...]
# while entries past seq_lengths are copied through:
output[0, 7:, :, ...] = input[0, 7:, :, ...]
output[1, 2:, :, ...] = input[1, 2:, :, ...]
output[2, 3:, :, ...] = input[2, 3:, :, ...]
output[3, 5:, :, ...] = input[3, 5:, :, ...]
Args:
- input: A Tensor. The input to reverse.
- seq_lengths: A Tensor of type int64. 1-D with length input.dims(0) and max(seq_lengths) < input.dims(seq_dim).
- seq_dim: An int. The dimension which is partially reversed.
- name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as input. The partially reversed input. It has the same shape as input.
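For a 2-D input with seq_dim = 1, the rule "reverse the first seq_lengths[i] elements of row i, copy the rest" can be sketched in plain Python. This is an illustration only; reverse_sequence_2d is a hypothetical name, and the sketch does not enforce the bound on seq_lengths stated above.

```python
def reverse_sequence_2d(rows, seq_lengths):
    """For each row i, reverse its first seq_lengths[i] elements; copy the rest."""
    if len(seq_lengths) != len(rows):
        raise ValueError("seq_lengths must have one entry per row")
    result = []
    for row, length in zip(rows, seq_lengths):
        # Reversed prefix followed by the untouched suffix.
        result.append(row[:length][::-1] + row[length:])
    return result


rows = [[1, 2, 3, 4, 5],
        [6, 7, 8, 9, 10]]
print(reverse_sequence_2d(rows, [3, 2]))
# [[3, 2, 1, 4, 5], [7, 6, 8, 9, 10]]
```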
tf.reverse(tensor, dims, name=None)
Reverses specific dimensions of a tensor.
Given a tensor, and a bool tensor dims representing the dimensions of tensor, this operation reverses each dimension i of tensor where dims[i] is True.
tensor can have up to 8 dimensions. The number of dimensions of tensor must equal the number of elements in dims. In other words:
rank(tensor) = size(dims)
For example:
# tensor 't' is [[[[ 0,  1,  2,  3],
#                  [ 4,  5,  6,  7],
#                  [ 8,  9, 10, 11]],
#                 [[12, 13, 14, 15],
#                  [16, 17, 18, 19],
#                  [20, 21, 22, 23]]]]
# tensor 't' shape is [1, 2, 3, 4]
# 'dims' is [False, False, False, True]
reverse(t, dims) ==> [[[[ 3,  2,  1,  0],
                        [ 7,  6,  5,  4],
                        [11, 10,  9,  8]],
                       [[15, 14, 13, 12],
                        [19, 18, 17, 16],
                        [23, 22, 21, 20]]]]
# 'dims' is [False, True, False, False]
reverse(t, dims) ==> [[[[12, 13, 14, 15],
                        [16, 17, 18, 19],
                        [20, 21, 22, 23]],
                       [[ 0,  1,  2,  3],
                        [ 4,  5,  6,  7],
                        [ 8,  9, 10, 11]]]]
# 'dims' is [False, False, True, False]
reverse(t, dims) ==> [[[[ 8,  9, 10, 11],
                        [ 4,  5,  6,  7],
                        [ 0,  1,  2,  3]],
                       [[20, 21, 22, 23],
                        [16, 17, 18, 19],
                        [12, 13, 14, 15]]]]
Args:
- tensor: A Tensor. Must be one of the following types: uint8, int8, int32, bool, float32, float64. Up to 8-D.
- dims: A Tensor of type bool. 1-D. The dimensions to reverse.
- name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as tensor. The same shape as tensor.
tf.transpose(a, perm=None, name='transpose')
Transposes a. Permutes the dimensions according to perm.
The returned tensor's dimension i will correspond to the input dimension perm[i]. If perm is not given, it is set to (n-1...0), where n is the rank of the input tensor. Hence by default, this operation performs a regular matrix transpose on 2-D input Tensors. For example:
# 'x' is [[1 2 3]
#         [4 5 6]]
tf.transpose(x) ==> [[1 4]
                     [2 5]
                     [3 6]]
# Equivalently
tf.transpose(x, perm=[1, 0]) ==> [[1 4]
                                  [2 5]
                                  [3 6]]
# 'perm' is more useful for n-dimensional tensors, for n > 2
# 'x' is [[[1  2  3]
#          [4  5  6]]
#         [[7  8  9]
#          [10 11 12]]]
# Take the transpose of the matrices in dimension-0
tf.transpose(x, perm=[0, 2, 1]) ==> [[[1  4]
                                      [2  5]
                                      [3  6]]
                                     [[7 10]
                                      [8 11]
                                      [9 12]]]
Args:
- a: A Tensor.
- perm: A permutation of the dimensions of a.
- name: A name for the operation (optional).
Returns:
A transposed Tensor.
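The statement that output dimension i corresponds to input dimension perm[i] can be made concrete with a plain-Python sketch on nested lists. This is an illustration of the index mapping, not TensorFlow's implementation; shape_of and transpose_nested are hypothetical names, and the sketch assumes non-empty rectangular nested lists.

```python
def shape_of(nested):
    """Shape of a rectangular, non-empty nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return shape


def transpose_nested(nested, perm=None):
    """Permute dimensions so that output dimension i is input dimension perm[i]."""
    shape = shape_of(nested)
    rank = len(shape)
    if perm is None:
        perm = list(range(rank - 1, -1, -1))  # default: reverse all dimensions
    out_shape = [shape[p] for p in perm]

    def build(prefix):
        if len(prefix) == rank:
            # Output index `prefix` reads the input at in_index[perm[i]] = prefix[i].
            in_index = [0] * rank
            for i, p in enumerate(perm):
                in_index[p] = prefix[i]
            element = nested
            for k in in_index:
                element = element[k]
            return element
        return [build(prefix + [k]) for k in range(out_shape[len(prefix)])]

    return build([])


x = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
print(transpose_nested([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]
print(transpose_nested(x, perm=[0, 2, 1]))
# [[[1, 4], [2, 5], [3, 6]], [[7, 10], [8, 11], [9, 12]]]
```

With perm=[0, 1] on a 2-D input the sketch returns the input unchanged, which is why the 2-D transpose corresponds to perm=[1, 0].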
tf.gather(params, indices, name=None)
Gather slices from params according to indices.
indices must be an integer tensor of any dimension (usually 0-D or 1-D). Produces an output tensor with shape indices.shape + params.shape[1:] where:
# Scalar indices
output[:, ..., :] = params[indices, :, ... :]
# Vector indices
output[i, :, ..., :] = params[indices[i], :, ... :]
# Higher rank indices
output[i, ..., j, :, ... :] = params[indices[i, ..., j], :, ..., :]
If indices is a permutation and len(indices) == params.shape[0] then this operation will permute params accordingly.
Args:
- params: A Tensor.
- indices: A Tensor. Must be one of the following types: int32, int64.
- name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as params.
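The three index cases for gather (scalar, vector, higher rank) collapse into one recursive rule on nested lists. The sketch below is plain Python for illustration; gather_rows is a hypothetical name, not a TensorFlow function.

```python
def gather_rows(params, indices):
    """Pick slices params[i] for every integer i in a (possibly nested) index list."""
    if isinstance(indices, list):
        return [gather_rows(params, i) for i in indices]
    return params[indices]


params = [[1, 2], [3, 4], [5, 6]]
print(gather_rows(params, 1))           # [3, 4]
print(gather_rows(params, [2, 0, 1]))   # [[5, 6], [1, 2], [3, 4]]
print(gather_rows(params, [[0], [2]]))  # [[[1, 2]], [[5, 6]]]
```

The second call passes a permutation of the row indices, so the result is params with its rows permuted.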
tf.dynamic_partition(data, partitions, num_partitions, name=None)
Partitions data into num_partitions tensors using indices from partitions.
For each index tuple js of size partitions.ndim, the slice data[js, ...] becomes part of outputs[partitions[js]]. The slices with partitions[js] = i are placed in outputs[i] in lexicographic order of js, and the first dimension of outputs[i] is the number of entries in partitions equal to i. In detail,
outputs[i].shape = [sum(partitions == i)] + data.shape[partitions.ndim:]
outputs[i] = pack([data[js, ...] for js if partitions[js] == i])
data.shape must start with partitions.shape.
For example:
# Scalar partitions
partitions = 1
num_partitions = 2
data = [10, 20]
outputs[0] = []  # Empty with shape [0, 2]
outputs[1] = [[10, 20]]
# Vector partitions
partitions = [0, 0, 1, 1, 0]
num_partitions = 2
data = [10, 20, 30, 40, 50]
outputs[0] = [10, 20, 50]
outputs[1] = [30, 40]
Args:
- data: A Tensor.
- partitions: A Tensor of type int32. Any shape. Indices in the range [0, num_partitions).
- num_partitions: An int that is >= 1. The number of partitions to output.
- name: A name for the operation (optional).
Returns:
A list of num_partitions Tensor objects of the same type as data.
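For the vector-partitions case, the routing rule can be sketched in a few lines of plain Python. This is an illustration only; dynamic_partition_1d is a hypothetical name and covers just 1-D partitions.

```python
def dynamic_partition_1d(data, partitions, num_partitions):
    """Route data[j] to outputs[partitions[j]], preserving the original order."""
    if len(data) != len(partitions):
        raise ValueError("data and partitions must have the same length")
    outputs = [[] for _ in range(num_partitions)]
    for item, target in zip(data, partitions):
        outputs[target].append(item)
    return outputs


print(dynamic_partition_1d([10, 20, 30, 40, 50], [0, 0, 1, 1, 0], 2))
# [[10, 20, 50], [30, 40]]
```

A partition index that never occurs simply leaves its output list empty.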
tf.dynamic_stitch(indices, data, name=None)
Interleave the values from the data tensors into a single tensor.
Builds a merged tensor such that
merged[indices[m][i, ..., j], ...] = data[m][i, ..., j, ...]
For example, if each indices[m] is scalar or vector, we have
# Scalar indices
merged[indices[m], ...] = data[m][...]
# Vector indices
merged[indices[m][i], ...] = data[m][i, ...]
Each data[i].shape must start with the corresponding indices[i].shape, and the rest of data[i].shape must be constant w.r.t. i. That is, we must have data[i].shape = indices[i].shape + constant. In terms of this constant, the output shape is
merged.shape = [max(indices) + 1] + constant
Values are merged in order, so if an index appears in both indices[m][i] and indices[n][j] for (m,i) < (n,j) the slice data[n][j] will appear in the merged result.
For example:
indices[0] = 6
indices[1] = [4, 1]
indices[2] = [[5, 2], [0, 3]]
data[0] = [61, 62]
data[1] = [[41, 42], [11, 12]]
data[2] = [[[51, 52], [21, 22]], [[1, 2], [31, 32]]]
merged = [[1, 2], [11, 12], [21, 22], [31, 32], [41, 42], [51, 52], [61, 62]]
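The merge rule can be sketched in plain Python by walking each indices[m] together with data[m] and writing every slice into its target position. This is an illustration of the semantics only; the function below is a hypothetical stand-in that works on nested lists, and positions that no index refers to are left as None.

```python
def dynamic_stitch(indices, data):
    """Place each data slice at the position named by the matching index."""
    pairs = []

    def collect(idx, dat):
        # Descend while idx is a list; a bare int addresses one slice of dat.
        if isinstance(idx, list):
            for sub_idx, sub_dat in zip(idx, dat):
                collect(sub_idx, sub_dat)
        else:
            pairs.append((idx, dat))

    for idx, dat in zip(indices, data):
        collect(idx, dat)
    merged = [None] * (max(i for i, _ in pairs) + 1)
    for i, dat in pairs:
        merged[i] = dat  # later pairs overwrite earlier ones
    return merged


indices = [6, [4, 1], [[5, 2], [0, 3]]]
data = [[61, 62],
        [[41, 42], [11, 12]],
        [[[51, 52], [21, 22]], [[1, 2], [31, 32]]]]
print(dynamic_stitch(indices, data))
# [[1, 2], [11, 12], [21, 22], [31, 32], [41, 42], [51, 52], [61, 62]]
```

The largest index here is 6, so the merged result has 7 entries along its first dimension.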
Args:
- indices: A list of at least 2 Tensor objects of type int32.
- data: A list with the same number of Tensor objects as indices of Tensor objects of the same type.
- name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as data.
Variables
Note: Functions taking Tensor arguments can also take anything accepted by tf.convert_to_tensor.
Contents
Variables
- Variables
  - class tf.Variable
- Variable helper functions
  - tf.all_variables()
  - tf.trainable_variables()
  - tf.initialize_all_variables()
  - tf.initialize_variables(var_list, name='init')
  - tf.assert_variables_initialized(var_list=None)
- Saving and Restoring Variables
  - class tf.train.Saver
  - tf.train.latest_checkpoint(checkpoint_dir, latest_filename=None)
  - tf.train.get_checkpoint_state(checkpoint_dir, latest_filename=None)
  - tf.train.update_checkpoint_state(save_dir, model_checkpoint_path, all_model_checkpoint_paths=None, latest_filename=None)
- Sharing Variables
  - tf.get_variable(name, shape=None, dtype=tf.float32, initializer=None, trainable=True, collections=None)
  - tf.get_variable_scope()
  - tf.variable_scope(name_or_scope, reuse=None, initializer=None)
  - tf.constant_initializer(value=0.0)
  - tf.random_normal_initializer(mean=0.0, stddev=1.0, seed=None)
  - tf.truncated_normal_initializer(mean=0.0, stddev=1.0, seed=None)
  - tf.random_uniform_initializer(minval=0.0, maxval=1.0, seed=None)
  - tf.uniform_unit_scaling_initializer(factor=1.0, seed=None)
  - tf.zeros_initializer(shape, dtype=tf.float32)
- Sparse Variable Updates
  - tf.scatter_update(ref, indices, updates, use_locking=None, name=None)
  - tf.scatter_add(ref, indices, updates, use_locking=None, name=None)
  - tf.scatter_sub(ref, indices, updates, use_locking=None, name=None)
  - tf.sparse_mask(a, mask_indices, name=None)
  - class tf.IndexedSlices
Variables
class tf.Variable
See the Variables How To for a high level overview.
A variable maintains state in the graph across calls to run(). You add a variable to the graph by constructing an instance of the class Variable.
The Variable() constructor requires an initial value for the variable, which can be a Tensor of any type and shape. The initial value defines the type and shape of the variable. After construction, the type and shape of the variable are fixed. The value can be changed using one of the assign methods.
If you want to change the shape of a variable later you have to use an assign Op with validate_shape=False.
Just like any Tensor, variables created with Variable() can be used as inputs for other Ops in the graph. Additionally, all the operators overloaded for the Tensor class are carried over to variables, so you can also add nodes to the graph by just doing arithmetic on variables.
import tensorflow as tf
# Create a variable.
w = tf.Variable(<initial-value>, name=<optional-name>)
# Use the variable in the graph like any Tensor.
y = tf.matmul(w, ...another variable or tensor...)
# The overloaded operators are available too.
z = tf.sigmoid(w + b)
# Assign a new value to the variable with `assign()` or a related method.
w.assign(w + 1.0)
w.assign_add(1.0)
When you launch the graph, variables have to be explicitly initialized before you can run Ops that use their value. You can initialize a variable by running its initializer op, restoring the variable from a save file, or simply running an assign Op that assigns a value to the variable. In fact, the variable initializer op is just an assign Op that assigns the variable's initial value to the variable itself.
# Launch the graph in a session.
with tf.Session() as sess:
    # Run the variable initializer.
    sess.run(w.initializer)
    # ...you now can run ops that use the value of 'w'...
The most common initialization pattern is to use the convenience function initialize_all_variables() to add an Op to the graph that initializes all the variables. You then run that Op after launching the graph.
# Add an Op to initialize all variables.
init_op = tf.initialize_all_variables()
# Launch the graph in a session.
with tf.Session() as sess:
    # Run the Op that initializes all variables.
    sess.run(init_op)
    # ...you can now run any Op that uses variable values...
If you need to create a variable with an initial value dependent on another variable, use the other variable's initialized_value(). This ensures that variables are initialized in the right order.
All variables are automatically collected in the graph where they are created. By default, the constructor adds the new variable to the graph collection GraphKeys.VARIABLES. The convenience function all_variables() returns the contents of that collection.
When building a machine learning model it is often convenient to distinguish between variables holding the trainable model parameters and other variables such as a global step variable used to count training steps. To make this easier, the variable constructor supports a trainable=<bool> parameter. If True, the new variable is also added to the graph collection GraphKeys.TRAINABLE_VARIABLES. The convenience function trainable_variables() returns the contents of this collection. The various Optimizer classes use this collection as the default list of variables to optimize.
Creating a variable.
tf.Variable.__init__(initial_value, trainable=True, collections=None, validate_shape=True, name=None)
Creates a new variable with value initial_value.
The new variable is added to the graph collections listed in collections, which defaults to [GraphKeys.VARIABLES].
If trainable is True the variable is also added to the graph collection GraphKeys.TRAINABLE_VARIABLES.
This constructor creates both a variable Op and an assign Op to set the variable to its initial value.
Args:
- initial_value: A Tensor, or Python object convertible to a Tensor. The initial value for the Variable. Must have a shape specified unless validate_shape is set to False.
- trainable: If True, the default, also adds the variable to the graph collection GraphKeys.TRAINABLE_VARIABLES. This collection is used as the default list of variables to use by the Optimizer classes.
- collections: List of graph collections keys. The new variable is added to these collections. Defaults to [GraphKeys.VARIABLES].
- validate_shape: If False, allows the variable to be initialized with a value of unknown shape. If True, the default, the shape of initial_value must be known.
- name: Optional name for the variable. Defaults to 'Variable' and gets uniquified automatically.
Returns:
A Variable.
Raises:
- ValueError: If the initial value does not have a shape and validate_shape is True.
tf.Variable.initialized_value()
Returns the value of the initialized variable.
You should use this instead of the variable itself to initialize another variable with a value that depends on the value of this variable.
# Initialize 'v' with a random tensor.
v = tf.Variable(tf.truncated_normal([10, 40]))
# Use `initialized_value` to guarantee that `v` has been
# initialized before its value is used to initialize `w`.
# The random values are picked only once.
w = tf.Variable(v.initialized_value() * 2.0)
Returns:
A Tensor holding the value of this variable after its initializer has run.
Changing a variable value.
tf.Variable.assign(value, use_locking=False)
Assigns a new value to the variable.
This is essentially a shortcut for assign(self, value).
Args:
- value: A Tensor. The new value for this variable.
- use_locking: If True, use locking during the assignment.
Returns:
A Tensor that will hold the new value of this variable after the assignment has completed.
tf.Variable.assign_add(delta, use_locking=False)
Adds a value to this variable.
This is essentially a shortcut for assign_add(self, delta).
Args:
- delta: A Tensor. The value to add to this variable.
- use_locking: If True, use locking during the operation.
Returns:
A Tensor that will hold the new value of this variable after the addition has completed.
tf.Variable.assign_sub(delta, use_locking=False)
Subtracts a value from this variable.
This is essentially a shortcut for assign_sub(self, delta).
Args:
- delta: A Tensor. The value to subtract from this variable.
- use_locking: If True, use locking during the operation.
Returns:
A Tensor that will hold the new value of this variable after the subtraction has completed.
tf.Variable.scatter_sub(sparse_delta, use_locking=False)
Subtracts IndexedSlices from this variable.
This is essentially a shortcut for scatter_sub(self, sparse_delta.indices, sparse_delta.values).
Args:
- sparse_delta: IndexedSlices to be subtracted from this variable.
- use_locking: If True, use locking during the operation.
Returns:
A Tensor that will hold the new value of this variable after the scattered subtraction has completed.
Raises:
- ValueError: if sparse_delta is not an IndexedSlices.
tf.Variable.count_up_to(limit)
Increments this variable until it reaches limit.
When that Op is run it tries to increment the variable by 1. If incrementing the variable would bring it above limit then the Op raises the exception OutOfRangeError.
If no error is raised, the Op outputs the value of the variable before the increment.
This is essentially a shortcut for count_up_to(self, limit).
Args:
- limit: value at which incrementing the variable raises an error.
Returns:
A Tensor that will hold the variable value before the increment. If no other Op modifies this variable, the values produced will all be distinct.
tf.Variable.eval(session=None)
In a session, computes and returns the value of this variable.
This is not a graph construction method, it does not add ops to the graph.
This convenience method requires a session where the graph containing this variable has been launched. If no session is passed, the default session is used. See the Session class for more information on launching a graph and on sessions.
v = tf.Variable([1, 2])
init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    # Usage passing the session explicitly.
    print(v.eval(sess))
    # Usage with the default session. The 'with' block
    # above makes 'sess' the default session.
    print(v.eval())
Args:
- session: The session to use to evaluate this variable. If none, the default session is used.
Returns:
A numpy ndarray with a copy of the value of this variable.
Properties.
tf.Variable.name
The name of this variable.
tf.Variable.dtype
The DType of this variable.
tf.Variable.get_shape()
The TensorShape of this variable.
Returns:
A TensorShape.
tf.Variable.device
The device of this variable.
tf.Variable.initializer
The initializer operation for this variable.
tf.Variable.graph
The Graph of this variable.
tf.Variable.op
The Operation of this variable.
Variable helper functions
TensorFlow provides a set of functions to help manage the set of variables collected in the graph.
tf.all_variables()
Returns all variables collected in the graph.
The Variable() constructor automatically adds new variables to the graph collection GraphKeys.VARIABLES. This convenience function returns the contents of that collection.
Returns:
A list of Variable objects.
tf.trainable_variables()
Returns all variables created with trainable=True.
When passed trainable=True, the Variable() constructor automatically adds new variables to the graph collection GraphKeys.TRAINABLE_VARIABLES. This convenience function returns the contents of that collection.
Returns:
A list of Variable objects.
tf.initialize_all_variables()
Returns an Op that initializes all variables.
This is just a shortcut for initialize_variables(all_variables())
Returns:
An Op that initializes all variables in the graph.
tf.initialize_variables(var_list, name='init')
Returns an Op that initializes a list of variables.
After you launch the graph in a session, you can run the returned Op to initialize all the variables in var_list. This Op runs all the initializers of the variables in var_list in parallel.
Calling initialize_variables() is equivalent to passing the list of initializers to Group().
If var_list is empty, however, the function still returns an Op that can be run. That Op just has no effect.
Args:
- var_list: List of Variable objects to initialize.
- name: Optional name for the returned operation.
Returns:
An Op that runs the initializers of all the specified variables.
tf.assert_variables_initialized(var_list=None)
Returns an Op to check if variables are initialized.
When run, the returned Op will raise the exception FailedPreconditionError if any of the variables has not yet been initialized.
Note: This function is implemented by trying to fetch the values of the variables. If one of the variables is not initialized a message may be logged by the C++ runtime. This is expected.
Args:
- var_list: List of Variable objects to check. Defaults to the value of all_variables().
Returns:
An Op, or None if there are no variables.
Saving and Restoring Variables
class tf.train.Saver
Saves and restores variables.
See Variables for an overview of variables, saving and restoring. The Saver class adds ops to save and restore variables to and from checkpoints. It also provides convenience methods to run these ops.
Checkpoints are binary files in a proprietary format which map variable names to tensor values. The best way to examine the contents of a checkpoint is to load it using a Saver.
Savers can automatically number checkpoint filenames with a provided counter. This lets you keep multiple checkpoints at different steps while training a model. For example you can number the checkpoint filenames with the training step number. To avoid filling up disks, savers manage checkpoint files automatically. For example, they can keep only the N most recent files, or one checkpoint for every N hours of training.
You number checkpoint filenames by passing a value to the optional global_step argument to save():
saver.save(sess, 'my-model', global_step=0) ==> filename: 'my-model-0' ...
saver.save(sess, 'my-model', global_step=1000) ==> filename: 'my-model-1000'
Additionally, optional arguments to the Saver() constructor let you control the proliferation of checkpoint files on disk:
- max_to_keep indicates the maximum number of recent checkpoint files to keep. As new files are created, older files are deleted. If None or 0, all checkpoint files are kept. Defaults to 5 (that is, the 5 most recent checkpoint files are kept.)
- keep_checkpoint_every_n_hours: In addition to keeping the most recent max_to_keep checkpoint files, you might want to keep one checkpoint file for every N hours of training. This can be useful if you want to later analyze how a model progressed during a long training session. For example, passing keep_checkpoint_every_n_hours=2 ensures that you keep one checkpoint file for every 2 hours of training. The default value of 10,000 hours effectively disables the feature.
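The interaction between global_step numbering and max_to_keep can be modeled with a small plain-Python sketch. It is an illustration of the bookkeeping only: the class CheckpointNamer is hypothetical, it touches no files, and it deliberately ignores keep_checkpoint_every_n_hours.

```python
from collections import deque


class CheckpointNamer(object):
    """Hypothetical model of checkpoint naming and max_to_keep retention."""

    def __init__(self, max_to_keep=5):
        self.max_to_keep = max_to_keep
        self.kept = deque()  # oldest checkpoint name first

    def save(self, prefix, global_step=None):
        """Return (new checkpoint name, names that fall out of the window)."""
        if global_step is None:
            name = prefix
        else:
            name = "%s-%d" % (prefix, global_step)
        self.kept.append(name)
        deleted = []
        # max_to_keep of None or 0 means every checkpoint is kept.
        while self.max_to_keep and len(self.kept) > self.max_to_keep:
            deleted.append(self.kept.popleft())
        return name, deleted


namer = CheckpointNamer(max_to_keep=2)
print(namer.save('my-model', global_step=0))     # ('my-model-0', [])
print(namer.save('my-model', global_step=1000))  # ('my-model-1000', [])
print(namer.save('my-model', global_step=2000))  # ('my-model-2000', ['my-model-0'])
```

With max_to_keep=2, the third save pushes the oldest name out of the retention window, which corresponds to the saver deleting that file.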
Note that you still have to call the save() method to save the model. Passing these arguments to the constructor will not save variables automatically for you. A training program that saves regularly looks like:
...
# Create a saver.
saver = tf.train.Saver(...variables...)
# Launch the graph and train, saving the model every 1,000 steps.
sess = tf.Session()
for step in range(1000000):
    sess.run(..training_op..)
    if step % 1000 == 0:
        # Append the step number to the checkpoint name:
        saver.save(sess, 'my-model', global_step=step)
In addition to checkpoint files, savers keep a protocol buffer on disk with the list of recent checkpoints. This is used to manage numbered checkpoint files and by latest_checkpoint(), which makes it easy to discover the path to the most recent checkpoint. That protocol buffer is stored in a file named 'checkpoint' next to the checkpoint files.
If you create several savers, you can specify a different filename for the protocol buffer file in the call to save().
tf.train.Saver.__init__(var_list=None, reshape=False, sharded=False, max_to_keep=5, keep_checkpoint_every_n_hours=10000.0, name=None, restore_sequentially=False, saver_def=None, builder=None)
Creates a Saver.
The constructor adds ops to save and restore variables.
var_list specifies the variables that will be saved and restored. It can be passed as a dict or a list:
- A dict of names to variables: The keys are the names that will be used to save or restore the variables in the checkpoint files.
- A list of variables: The variables will be keyed with their op name in the checkpoint files.
For example:
v1 = tf.Variable(..., name='v1')
v2 = tf.Variable(..., name='v2')
# Pass the variables as a dict:
saver = tf.train.Saver({'v1': v1, 'v2': v2})
# Or pass them as a list.
saver = tf.train.Saver([v1, v2])
# Passing a list is equivalent to passing a dict with the variable op names
# as keys:
saver = tf.train.Saver({v.op.name: v for v in [v1, v2]})
The optional reshape argument, if True, allows restoring a variable from a save file where the variable had a different shape, but the same number of elements and type. This is useful if you have reshaped a variable and want to reload it from an older checkpoint.
The optional sharded argument, if True, instructs the saver to shard checkpoints per device.
Args:
- var_list: A list of Variables or a dictionary mapping names to Variables. If None, defaults to the list of all variables.
- reshape: If True, allows restoring parameters from a checkpoint where the variables have a different shape.
- sharded: If True, shard the checkpoints, one per device.
- max_to_keep: maximum number of recent checkpoints to keep. Defaults to 5.
- keep_checkpoint_every_n_hours: How often to keep checkpoints. Defaults to 10,000 hours.
- name: string. Optional name to use as a prefix when adding operations.
- restore_sequentially: A Bool, which if true, causes restore of different variables to happen sequentially within each device. This can lower memory usage when restoring very large models.
- saver_def: Optional SaverDef proto to use instead of running the builder. This is only useful for specialty code that wants to recreate a Saver object for a previously built Graph that had a Saver. The saver_def proto should be the one returned by the as_saver_def() call of the Saver that was created for that Graph.
- builder: Optional SaverBuilder to use if a saver_def was not provided. Defaults to BaseSaverBuilder().
Raises:
- TypeError: If var_list is invalid.
- ValueError: If any of the keys or values in var_list is not unique.
tf.train.Saver.save(sess, save_path, global_step=None, latest_filename=None)
Saves variables.
This method runs the ops added by the constructor for saving variables. It requires a session in which the graph was launched. The variables to save must also have been initialized.
The method returns the path of the newly created checkpoint file. This path can be passed directly to a call to restore().
Args:
- sess: A Session to use to save the variables.
- save_path: string. Path to the checkpoint filename. If the saver is sharded, this is the prefix of the sharded checkpoint filename.
- global_step: If provided, the global step number is appended to save_path to create the checkpoint filename. The optional argument can be a Tensor, a Tensor name or an integer.
- latest_filename: Optional name for the protocol buffer file that will contain the list of most recent checkpoint filenames. That file, kept in the same directory as the checkpoint files, is automatically managed by the saver to keep track of recent checkpoints. Defaults to 'checkpoint'.
Returns:
A string: path at which the variables were saved. If the saver is sharded, this string ends with '-?????-of-nnnnn', where 'nnnnn' is the number of shards created.
Raises:
- TypeError: If sess is not a Session.
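As a rough illustration of how global_step and max_to_keep interact, the pure-Python sketch below mimics the bookkeeping only: it appends the step number to save_path to form each checkpoint name and keeps just the most recent max_to_keep names. The helper name record_checkpoint, the "-" separator and the my-model prefix are assumptions made for the example. This is not the Saver's actual implementation, and nothing is written to disk.

```python
def record_checkpoint(kept, save_path, global_step, max_to_keep=5):
    """Toy model of checkpoint bookkeeping; returns (kept, deleted) lists."""
    # The global step is appended to save_path to form the checkpoint name.
    # The "-" separator is an assumption made for this illustration.
    new_path = "%s-%d" % (save_path, global_step)
    kept = kept + [new_path]
    # Only the most recent max_to_keep checkpoints survive.
    overflow = max(len(kept) - max_to_keep, 0)
    return kept[overflow:], kept[:overflow]


kept = []
all_deleted = []
for step in range(0, 7000, 1000):  # seven saves: steps 0, 1000, ..., 6000
    kept, deleted = record_checkpoint(kept, "my-model", step)
    all_deleted.extend(deleted)

print(kept)         # oldest to newest, in the spirit of last_checkpoints
print(all_deleted)  # names that would have been rotated out
```

With seven saves and the default max_to_keep of 5, the two oldest names are rotated out and the five newest remain, sorted from oldest to newest.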
tf.train.Saver.restore(sess, save_path)
Restores previously saved variables.
This method runs the ops added by the constructor for restoring variables. It requires a session in which the graph was launched. The variables to restore do not have to have been initialized, as restoring is itself a way to initialize variables.
The save_path argument is typically a value previously returned from a save() call, or a call to latest_checkpoint().
Args:
- sess: A Session to use to restore the parameters.
- save_path: Path where parameters were previously saved.
Other utility methods.
tf.train.Saver.last_checkpoints
List of not-yet-deleted checkpoint filenames.
You can pass any of the returned values to restore().
Returns:
A list of checkpoint filenames, sorted from oldest to newest.
tf.train.Saver.set_last_checkpoints(last_checkpoints)
Sets the list of not-yet-deleted checkpoint filenames.
Args:
- last_checkpoints: a list of checkpoint filenames.
Raises:
- AssertionError: if the list of checkpoint filenames has already been set.
tf.train.Saver.as_saver_def()
Generates a SaverDef representation of this saver.
Returns:
A SaverDef proto.
tf.train.latest_checkpoint(checkpoint_dir, latest_filename=None)
Finds the filename of the latest saved checkpoint file.
Args:
- checkpoint_dir: Directory where the variables were saved.
- latest_filename: Optional name for the protocol buffer file that contains the list of most recent checkpoint filenames. See the corresponding argument to Saver.save().
Returns:
The full path to the latest checkpoint, or None if no checkpoint was found.
tf.train.get_checkpoint_state(checkpoint_dir, latest_filename=None)
Returns CheckpointState proto from the "checkpoint" file.
If the "checkpoint" file contains a valid CheckpointState proto, returns it.
Args:
- checkpoint_dir: The directory of checkpoints.
- latest_filename: Optional name of the checkpoint file. Defaults to 'checkpoint'.
Returns:
A CheckpointState if the state was available, None otherwise.
tf.train.update_checkpoint_state(save_dir, model_checkpoint_path, all_model_checkpoint_paths=None, latest_filename=None)
Updates the content of the 'checkpoint' file.
This updates the checkpoint file containing a CheckpointState proto.
Args:
- save_dir: Directory where the model was saved.
- model_checkpoint_path: The checkpoint file.
- all_model_checkpoint_paths: list of strings. Paths to all not-yet-deleted checkpoints, sorted from oldest to newest. If this is a non-empty list, the last element must be equal to model_checkpoint_path. These paths are also saved in the CheckpointState proto.
- latest_filename: Optional name of the checkpoint file. Defaults to 'checkpoint'.
Raises:
- RuntimeError: If the save paths conflict.
Sharing Variables
TensorFlow provides several classes and operations that you can use to create variables contingent on certain conditions.
tf.get_variable(name, shape=None, dtype=tf.float32, initializer=None, trainable=True, collections=None)
Gets an existing variable with these parameters or creates a new one.
This function prefixes the name with the current variable scope and performs reuse checks. See the Variable Scope How To for an extensive description of how reusing works. Here is a basic example:
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])  # v.name == "foo/v:0"
    w = tf.get_variable("w", [1])  # w.name == "foo/w:0"
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v")  # The same as v above.
If initializer is None (the default), the default initializer passed in the constructor is used. If that one is None too, a UniformUnitScalingInitializer will be used.
Args:
- name: the name of the new or existing variable.
- shape: shape of the new or existing variable.
- dtype: type of the new or existing variable (defaults to DT_FLOAT).
- initializer: initializer for the variable if one is created.
- trainable: If True also add the variable to the graph collection GraphKeys.TRAINABLE_VARIABLES (see variables.Variable).
- collections: List of graph collections keys to add the Variable to. Defaults to [GraphKeys.VARIABLES] (see variables.Variable).
Returns:
The created or existing variable.
Raises:
- ValueError: when creating a new variable and shape is not declared, or when violating reuse during variable creation. Reuse is set inside variable_scope.
tf.get_variable_scope()
Returns the current variable scope.
tf.variable_scope(name_or_scope, reuse=None, initializer=None)
Returns a context for variable scope.
Variable scope lets you create new variables and share already created ones, with checks that prevent creating or sharing by accident. For details, see the Variable Scope How To; only a few basic examples are presented here.
Simple example of how to create a new variable:
with tf.variable_scope("foo"):
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
        assert v.name == "foo/bar/v:0"
Basic example of sharing a variable:
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v", [1])
assert v1 == v
Sharing a variable by capturing a scope and setting reuse:
with tf.variable_scope("foo") as scope:
    v = tf.get_variable("v", [1])
    scope.reuse_variables()
    v1 = tf.get_variable("v", [1])
assert v1 == v
To prevent accidental sharing of variables, we raise an exception when getting an existing variable in a non-reusing scope.
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
    v1 = tf.get_variable("v", [1])
    # Raises ValueError("... v already exists ...").
Similarly, we raise an exception when trying to get a variable that does not exist in reuse mode.
with tf.variable_scope("foo", reuse=True):
    v = tf.get_variable("v", [1])
    # Raises ValueError("... v does not exist ...").
Note that the reuse flag is inherited: if we open a reusing scope, then all its sub-scopes become reusing as well.
Args:
- name_or_scope: string or VariableScope: the scope to open.
- reuse: True or None; if True, we go into reuse mode for this scope as well as all sub-scopes; if None, we just inherit the parent scope reuse.
- initializer: default initializer for variables within this scope.
Yields:
A scope that can be captured and reused.
Raises:
- ValueError: when trying to reuse within a create scope, or create within a reuse scope, or if reuse is not None or True.
- TypeError: when the types of some arguments are not appropriate.
tf.constant_initializer(value=0.0)
Returns an initializer that generates Tensors with a single value.
Args:
- value: A Python scalar. All elements of the initialized variable will be set to this value.
Returns:
An initializer that generates Tensors with a single value.
tf.random_normal_initializer(mean=0.0, stddev=1.0, seed=None)
Returns an initializer that generates Tensors with a normal distribution.
Args:
- mean: a python scalar or a scalar tensor. Mean of the random values to generate.
- stddev: a python scalar or a scalar tensor. Standard deviation of the random values to generate.
- seed: A Python integer. Used to create random seeds. See set_random_seed for behavior.
Returns:
An initializer that generates Tensors with a normal distribution.
tf.truncated_normal_initializer(mean=0.0, stddev=1.0, seed=None)
Returns an initializer that generates a truncated normal distribution.
These values are similar to values from a random_normal_initializer except that values more than two standard deviations from the mean are discarded and re-drawn. This is the recommended initializer for neural network weights and filters.
Args:
- mean: a python scalar or a scalar tensor. Mean of the random values to generate.
- stddev: a python scalar or a scalar tensor. Standard deviation of the random values to generate.
- seed: A Python integer. Used to create random seeds. See set_random_seed for behavior.
Returns:
An initializer that generates Tensors with a truncated normal distribution.
tf.random_uniform_initializer(minval=0.0, maxval=1.0, seed=None)
Returns an initializer that generates Tensors with a uniform distribution.
Args:
- minval: a python scalar or a scalar tensor. Lower bound of the range of random values to generate.
- maxval: a python scalar or a scalar tensor. Upper bound of the range of random values to generate.
- seed: A Python integer. Used to create random seeds. See set_random_seed for behavior.
Returns:
An initializer that generates Tensors with a uniform distribution.
tf.uniform_unit_scaling_initializer(factor=1.0, seed=None)
Returns an initializer that generates tensors without scaling variance.
When initializing a deep network, it is in principle advantageous to keep the scale of the input variance constant, so it does not explode or diminish by reaching the final layer. If the input is x and the operation x * W, and we want to initialize W uniformly at random, we need to pick W from
[-sqrt(3) / sqrt(dim), sqrt(3) / sqrt(dim)]
to keep the scale intact, where dim = W.shape[0] (the size of the input). A similar calculation for convolutional networks gives an analogous result with dim equal to the product of the first 3 dimensions. When nonlinearities are present, we need to multiply this by a constant factor. See https://arxiv.org/pdf/1412.6558v3.pdf for deeper motivation, experiments and the calculation of constants. In section 2.3 there, the constants were numerically computed: for a linear layer it's 1.0, relu: ~1.43, tanh: ~1.15.
Args:
- factor: Float. A multiplicative factor by which the values will be scaled.
- seed: A Python integer. Used to create random seeds. See set_random_seed for behavior.
Returns:
An initializer that generates tensors with unit variance.
tf.zeros_initializer(shape, dtype=tf.float32)
An adaptor for zeros() to match the Initializer spec.
Sparse Variable Updates
The sparse update ops modify a subset of the entries in a dense Variable, either overwriting the entries or adding / subtracting a delta. These are useful for training embedding models and similar lookup-based networks, since only a small subset of embedding vectors change in any given step.
Since a sparse update of a large tensor may be generated automatically during gradient computation (as in the gradient of tf.gather), an IndexedSlices class is provided that encapsulates a set of sparse indices and values. IndexedSlices objects are detected and handled automatically by the optimizers in most cases.
tf.scatter_update(ref, indices, updates, use_locking=None, name=None)
Applies sparse updates to a variable reference.
This operation computes
# Scalar indices
ref[indices, ...] = updates[...]
# Vector indices (for each i)
ref[indices[i], ...] = updates[i, ...]
# High rank indices (for each i, ..., j)
ref[indices[i, ..., j], ...] = updates[i, ..., j, ...]
This operation outputs ref after the update is done. This makes it easier to chain operations that need to use the updated value.
If indices contains duplicate entries, lexicographically later entries override earlier entries.
Requires updates.shape = indices.shape + ref.shape[1:].
Args:
- ref: A mutable Tensor. Should be from a Variable node.
- indices: A Tensor. Must be one of the following types: int32, int64. A tensor of indices into the first dimension of ref.
- updates: A Tensor. Must have the same type as ref. A tensor of updated values to store in ref.
- use_locking: An optional bool. Defaults to True. If True, the assignment will be protected by a lock; otherwise the behavior is undefined, but may exhibit less contention.
- name: A name for the operation (optional).
Returns:
Same as ref. Returned as a convenience for operations that want to use the updated values after the update is done.
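The vector-index case of tf.scatter_update can be mimicked in plain Python to see what happens with duplicate indices. This is only a sketch of the semantics on nested lists, not how TensorFlow implements the op; the function name scatter_update_rows and the sample data are made up for the example.

```python
def scatter_update_rows(ref, indices, updates):
    """Emulate ref[indices[i], ...] = updates[i, ...] on a list of rows."""
    if len(indices) != len(updates):
        raise ValueError("need one update row per index")
    for i, row in enumerate(indices):
        # A later duplicate index overwrites what an earlier one stored.
        ref[row] = list(updates[i])
    return ref  # the same object is returned, so calls can be chained


ref = [[0, 0], [0, 0], [0, 0], [0, 0]]
out = scatter_update_rows(ref, [1, 3, 1], [[1, 1], [3, 3], [9, 9]])
print(out)
```

Index 1 appears twice, so row 1 ends up holding the last update written to it, while rows 0 and 2 are untouched.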
tf.scatter_add(ref, indices, updates, use_locking=None, name=None)
Adds sparse updates to a variable reference.
This operation computes
# Scalar indices
ref[indices, ...] += updates[...]
# Vector indices (for each i)
ref[indices[i], ...] += updates[i, ...]
# High rank indices (for each i, ..., j)
ref[indices[i, ..., j], ...] += updates[i, ..., j, ...]
This operation outputs ref after the update is done. This makes it easier to chain operations that need to use the updated value.
Duplicate entries are handled correctly: if multiple indices reference the same location, their contributions add.
Requires updates.shape = indices.shape + ref.shape[1:].
Args:
- ref: A mutable Tensor. Must be one of the following types: float32, float64, int64, int32, uint8, int16, int8, complex64, qint8, quint8, qint32. Should be from a Variable node.
- indices: A Tensor. Must be one of the following types: int32, int64. A tensor of indices into the first dimension of ref.
- updates: A Tensor. Must have the same type as ref. A tensor of updated values to add to ref.
- use_locking: An optional bool. Defaults to False. If True, the addition will be protected by a lock; otherwise the behavior is undefined, but may exhibit less contention.
- name: A name for the operation (optional).
Returns:
Same as ref. Returned as a convenience for operations that want to use the updated values after the update is done.
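The difference from scatter_update shows up with duplicate indices: contributions accumulate instead of overwriting. The plain-Python sketch below emulates the vector-index case of tf.scatter_add on nested lists; scatter_add_rows and the sample values are hypothetical and only illustrate the semantics.

```python
def scatter_add_rows(ref, indices, updates):
    """Emulate ref[indices[i], ...] += updates[i, ...] on a list of rows."""
    for i, row in enumerate(indices):
        # Each occurrence of an index adds its update row element-wise.
        ref[row] = [r + u for r, u in zip(ref[row], updates[i])]
    return ref


ref = [[1, 1], [1, 1], [1, 1]]
out = scatter_add_rows(ref, [0, 2, 0], [[1, 2], [10, 20], [3, 4]])
print(out)
```

Row 0 receives both of its updates, row 2 receives one, and row 1 keeps its original value. Subtraction works the same way with the sign flipped.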
tf.scatter_sub(ref, indices, updates, use_locking=None, name=None)
Subtracts sparse updates from a variable reference.
# Scalar indices
ref[indices, ...] -= updates[...]
# Vector indices (for each i)
ref[indices[i], ...] -= updates[i, ...]
# High rank indices (for each i, ..., j)
ref[indices[i, ..., j], ...] -= updates[i, ..., j, ...]
This operation outputs ref after the update is done. This makes it easier to chain operations that need to use the updated value.
Duplicate entries are handled correctly: if multiple indices reference the same location, their (negated) contributions add.
Requires updates.shape = indices.shape + ref.shape[1:].
Args:
- ref: A mutable Tensor. Must be one of the following types: float32, float64, int64, int32, uint8, int16, int8, complex64, qint8, quint8, qint32. Should be from a Variable node.
- indices: A Tensor. Must be one of the following types: int32, int64. A tensor of indices into the first dimension of ref.
- updates: A Tensor. Must have the same type as ref. A tensor of updated values to subtract from ref.
- use_locking: An optional bool. Defaults to False. If True, the subtraction will be protected by a lock; otherwise the behavior is undefined, but may exhibit less contention.
- name: A name for the operation (optional).
Returns:
Same as ref. Returned as a convenience for operations that want to use the updated values after the update is done.
tf.sparse_mask(a, mask_indices, name=None)
Masks elements of IndexedSlices.
Given an IndexedSlices instance a, returns another IndexedSlices that contains a subset of the slices of a. Only the slices at indices not specified in mask_indices are returned.
This is useful when you need to extract a subset of slices in an IndexedSlices object. For example:
# `a` contains slices at indices [12, 26, 37, 45] from a large tensor
# with shape [1000, 10]
a.indices => [12, 26, 37, 45]
tf.shape(a.values) => [4, 10]
# `b` will be the subset of `a` slices at its second and third indices, so
# we want to mask out its first and last indices (which are at absolute
# indices 12, 45)
b = tf.sparse_mask(a, [12, 45])
b.indices => [26, 37]
tf.shape(b.values) => [2, 10]
Args:
- a: An IndexedSlices instance.
- mask_indices: Indices of elements to mask.
- name: A name for the operation (optional).
Returns:
The masked IndexedSlices instance.
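The masking rule from the example can be reproduced on plain lists. In this sketch an IndexedSlices is modelled as a pair of Python lists (indices and values); the helper name mask_slices and the sample data are assumptions for illustration, not part of the TensorFlow API.

```python
def mask_slices(indices, values, mask_indices):
    """Keep only the slices whose index is NOT listed in mask_indices."""
    masked = set(mask_indices)
    kept = [(i, v) for i, v in zip(indices, values) if i not in masked]
    return [i for i, _ in kept], [v for _, v in kept]


a_indices = [12, 26, 37, 45]
a_values = [[float(i)] * 10 for i in a_indices]  # four slices of width 10
b_indices, b_values = mask_slices(a_indices, a_values, [12, 45])
print(b_indices)
print(len(b_values), len(b_values[0]))
```

Masking the absolute indices 12 and 45 leaves the slices at 26 and 37, so the values go from four rows to two rows of the same width.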
class tf.IndexedSlices
A sparse representation of a set of tensor slices at given indices.
This class is a simple wrapper for a pair of Tensor objects:
- values: A Tensor of any dtype with shape [D0, D1, ..., Dn].
- indices: A 1-D integer Tensor with shape [D0].
An IndexedSlices is typically used to represent a subset of a larger tensor dense of shape [LARGE0, D1, ..., Dn] where LARGE0 >> D0. The values in indices are the indices in the first dimension of the slices that have been extracted from the larger tensor.
The dense tensor dense represented by an IndexedSlices slices has
dense[slices.indices[i], :, :, :, ...] = slices.values[i, :, :, :, ...]
The IndexedSlices class is used principally in the definition of gradients for operations that have sparse gradients (e.g. tf.gather).
Contrast this representation with SparseTensor, which uses multi-dimensional indices and scalar values.
tf.IndexedSlices.__init__(values, indices, dense_shape=None)
Creates an IndexedSlices.
tf.IndexedSlices.values
A Tensor containing the values of the slices.
tf.IndexedSlices.indices
A 1-D Tensor containing the indices of the slices.
tf.IndexedSlices.dense_shape
A 1-D Tensor containing the shape of the corresponding dense tensor.
tf.IndexedSlices.name
The name of this IndexedSlices.
tf.IndexedSlices.dtype
The DType of elements in this tensor.
tf.IndexedSlices.device
The name of the device on which values will be produced, or None.
tf.IndexedSlices.op
The Operation that produces values as an output.
Class tensorflow::Session
A Session instance lets a caller drive a TensorFlow graph computation.
When a Session is created with a given target, a new Session object is bound to the universe of resources specified by that target. Those resources are available to this session to perform computation described in the GraphDef. After extending the session with a graph, the caller uses the Run() API to perform the computation and potentially fetch outputs as Tensors.
Example:
```c++
tensorflow::GraphDef graph;
// ... Create or load graph into "graph".

// This example uses the default options which connects
// to a local runtime.
tensorflow::SessionOptions options;
std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(options));

// Create the session with this graph.
tensorflow::Status s = session->Create(graph);
if (!s.ok()) { ... }

// Run the graph and fetch the first output of the "output"
// operation, and also run to but do not return anything
// for the "update_state" operation.
std::vector<tensorflow::Tensor> outputs;
s = session->Run({}, {"output:0"}, {"update_state"}, &outputs);
if (!s.ok()) { ... }

// Map the output as a flattened float tensor, and do something
// with it.
auto output_tensor = outputs[0].flat<float>();
if (output_tensor(0) > 0.5) { ... }

// Close the session to release the resources associated with
// this session.
session->Close();
```
A Session allows concurrent calls to Run(), though a Session must be created / extended by a single thread.
Only one thread must call Close(), and Close() must only be called after all other calls to Run() have returned.
Member Summary
- virtual Status tensorflow::Session::Create(const GraphDef &graph)=0
  - Create the graph to be used for the session.
- virtual Status tensorflow::Session::Extend(const GraphDef &graph)=0
  - Adds operations to the graph that is already registered with the Session.
- virtual Status tensorflow::Session::Run(const std::vector< std::pair< string, Tensor > > &inputs, const std::vector< string > &output_tensor_names, const std::vector< string > &target_node_names, std::vector< Tensor > *outputs)=0
  - Runs the graph with the provided input tensors and fills outputs for the endpoints specified in output_tensor_names. Runs to but does not return Tensors for the nodes in target_node_names.
- virtual Status tensorflow::Session::Close()=0
  - Closes this session.
- virtual tensorflow::Session::~Session()
Member Details
virtual Status tensorflow::Session::Create(const GraphDef &graph)=0
Create the graph to be used for the session.
Returns an error if this session has already been created with a graph. To re-use the session with a different graph, the caller must Close() the session first.
virtual Status tensorflow::Session::Extend(const GraphDef &graph)=0
Adds operations to the graph that is already registered with the Session.
The names of new operations in "graph" must not exist in the graph that is already registered.
virtual Status tensorflow::Session::Run(const std::vector< std::pair< string, Tensor > > &inputs, const std::vector< string > &output_tensor_names, const std::vector< string > &target_node_names, std::vector< Tensor > *outputs)=0
Runs the graph with the provided input tensors and fills outputs for the endpoints specified in output_tensor_names. Runs to but does not return Tensors for the nodes in target_node_names.
The order of tensors in outputs will match the order provided by output_tensor_names.
If Run returns OK(), then outputs->size() will be equal to output_tensor_names.size(). If Run does not return OK(), the state of outputs is undefined.
REQUIRES: The name of each Tensor of the input or output must match a "Tensor endpoint" in the GraphDef passed to Create().
REQUIRES: outputs is not nullptr if output_tensor_names is non-empty.
virtual Status tensorflow::Session::Close()=0
Closes this session.
Closing a session releases the resources used by this session on the TensorFlow runtime (specified during session creation by the SessionOptions::target field).
virtual tensorflow::Session::~Session()
Session management
class tf.Session
A class for running TensorFlow operations.
A Session object encapsulates the environment in which Operation objects are executed, and Tensor objects are evaluated. For example:
# Build a graph.
a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b
# Launch the graph in a session.
sess = tf.Session()
# Evaluate the tensor `c`.
print(sess.run(c))
A session may own resources, such as variables, queues, and readers. It is important to release these resources when they are no longer required. To do this, either invoke the close() method on the session, or use the session as a context manager. The following two examples are equivalent:
# Using the `close()` method.
sess = tf.Session()
sess.run(...)
sess.close()
# Using the context manager.
with tf.Session() as sess:
    sess.run(...)
The ConfigProto protocol buffer (https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/config.proto) exposes various configuration options for a session. For example, to create a session that uses soft constraints for device placement, and log the resulting placement decisions, create a session as follows:
# Launch the graph in a session that allows soft device placement and
# logs the placement decisions.
sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True,
                                        log_device_placement=True))
tf.Session.__init__(target='', graph=None, config=None)
Creates a new TensorFlow session.
If no graph argument is specified when constructing the session, the default graph will be launched in the session. If you are using more than one graph (created with tf.Graph() in the same process), you will have to use different sessions for each graph, but each graph can be used in multiple sessions. In this case, it is often clearer to pass the graph to be launched explicitly to the session constructor.
Args:
- target: (Optional.) The execution engine to connect to. Defaults to using an in-process engine. At present, no value other than the empty string is supported.
- graph: (Optional.) The Graph to be launched (described above).
- config: (Optional.) A ConfigProto protocol buffer with configuration options for the session.
tf.Session.run(fetches, feed_dict=None)
Runs the operations and evaluates the tensors in fetches.
This method runs one "step" of TensorFlow computation, by running the necessary graph fragment to execute every Operation and evaluate every Tensor in fetches, substituting the values in feed_dict for the corresponding input values.
The fetches argument may be a list of graph elements or a single graph element, and these determine the return value of this method. A graph element can be one of the following types:
- If the ith element of fetches is an Operation, the ith return value will be None.
- If the ith element of fetches is a Tensor, the ith return value will be a numpy ndarray containing the value of that tensor.
- If the ith element of fetches is a SparseTensor, the ith return value will be a SparseTensorValue containing the value of that sparse tensor.
The optional feed_dict argument allows the caller to override the value of tensors in the graph. Each key in feed_dict can be one of the following types:
- If the key is a Tensor, the value may be a Python scalar, string, list, or numpy ndarray that can be converted to the same dtype as that tensor. Additionally, if the key is a placeholder, the shape of the value will be checked for compatibility with the placeholder.
- If the key is a SparseTensor, the value should be a SparseTensorValue.
Args:
- fetches: A single graph element, or a list of graph elements (described above).
- feed_dict: A dictionary that maps graph elements to values (described above).
Returns:
Either a single value if fetches is a single graph element, or a list of values if fetches is a list (described above).
Raises:
- RuntimeError: If this Session is in an invalid state (e.g. has been closed).
- TypeError: If fetches or feed_dict keys are of an inappropriate type.
- ValueError: If fetches or feed_dict keys are invalid or refer to a Tensor that doesn't exist.
tf.Session.close()
Closes this session.
Calling this method frees all resources associated with the session.
Raises:
- RuntimeError: If an error occurs while closing the session.
tf.Session.graph
The graph that was launched in this session.
tf.Session.as_default()
Returns a context manager that makes this object the default session.
Use with the with keyword to specify that calls to Operation.run() or Tensor.eval() should be executed in this session.
c = tf.constant(..)
sess = tf.Session()
with sess.as_default():
    assert tf.get_default_session() is sess
    print(c.eval())
To get the current default session, use tf.get_default_session().
N.B. The as_default context manager does not close the session when you exit the context, and you must close the session explicitly.
c = tf.constant(...)
sess = tf.Session()
with sess.as_default():
    print(c.eval())
# ...
with sess.as_default():
    print(c.eval())
sess.close()
Alternatively, you can use with tf.Session(): to create a session that is automatically closed on exiting the context, including when an uncaught exception is raised.
N.B. The default session is a property of the current thread. If you create a new thread, and wish to use the default session in that thread, you must explicitly add a with sess.as_default(): in that thread's function.
Returns:
A context manager using this session as the default session.
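The per-thread behaviour noted above can be pictured with a toy model built on threading.local. Everything in this sketch (the _local stack, the as_default and get_default_session helpers, and the string standing in for a session) is a hypothetical stand-in written for illustration; it does not call TensorFlow and is not its implementation.

```python
import threading
from contextlib import contextmanager

_local = threading.local()  # each thread gets its own default-session stack


def get_default_session():
    stack = getattr(_local, "stack", [])
    return stack[-1] if stack else None


@contextmanager
def as_default(session):
    if not hasattr(_local, "stack"):
        _local.stack = []
    _local.stack.append(session)
    try:
        yield session
    finally:
        _local.stack.pop()  # leaving the context does not close the session


seen = {}


def worker(session, explicit):
    if explicit:
        # The new thread installs the default session itself.
        with as_default(session):
            seen["explicit"] = get_default_session()
    else:
        # Nothing was installed in this thread, so there is no default.
        seen["implicit"] = get_default_session()


sess = "toy-session"  # stand-in object, not a real session
with as_default(sess):
    seen["main"] = get_default_session()
    for explicit in (False, True):
        t = threading.Thread(target=worker, args=(sess, explicit))
        t.start()
        t.join()

print(seen)
```

The thread that skips as_default sees no default session even though the main thread is inside the context, which is why the with block has to be repeated inside the thread's function.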