python - Calculating gradient norm wrt weights with Keras
I am attempting to calculate the gradient norm with respect to the weights of a neural network with Keras (as a diagnostic tool). Eventually, I want to create a callback for this, but on the way there I have been working on a function that can compute the gradient and return actual values in the form of a NumPy array/scalar value (and not just a TensorFlow tensor). The code is as follows:
    import numpy as np
    import keras.backend as K
    from keras.layers import Dense
    from keras.models import Sequential


    def get_gradient_norm_func(model):
        grads = K.gradients(model.total_loss, model.trainable_weights)
        summed_squares = [K.sum(K.square(g)) for g in grads]
        norm = K.sqrt(sum(summed_squares))
        func = K.function([model.input], [norm])
        return func


    def main():
        x = np.random.random((128,)).reshape((-1, 1))
        y = 2 * x
        model = Sequential(layers=[Dense(2, input_shape=(1,)), Dense(1)])
        model.compile(loss='mse', optimizer='rmsprop')
        get_gradient = get_gradient_norm_func(model)
        history = model.fit(x, y, epochs=1)
        print(get_gradient([x]))


    if __name__ == '__main__':
        main()

The code fails on the call to get_gradient(). The traceback is lengthy and involves a lot of shapes, but gives little information on what the correct shape is. How can I correct this?
Ideally, I would like a backend-agnostic solution, but a TensorFlow-based solution is also an option.
    2017-08-15 15:39:14.914388: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1,-1] has negative dimensions
    2017-08-15 15:39:14.914414: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1,-1] has negative dimensions
         [[Node: dense_2_target = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
    2017-08-15 15:39:14.915026: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1,-1] has negative dimensions
    2017-08-15 15:39:14.915038: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1,-1] has negative dimensions
         [[Node: dense_2_target = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
    2017-08-15 15:39:14.915310: W tensorflow/core/framework/op_kernel.cc:1148] Invalid argument: Shape [-1] has negative dimensions
    2017-08-15 15:39:14.915321: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Shape [-1] has negative dimensions
         [[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
    Traceback (most recent call last):
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
        return fn(*args)
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
        status, run_metadata)
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/contextlib.py", line 89, in __exit__
        next(self.gen)
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
        pywrap_tensorflow.TF_GetCode(status))
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape [-1] has negative dimensions
         [[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "gradientlog.py", line 45, in <module>
        main()
      File "gradientlog.py", line 42, in main
        print(get_gradient([x]))
      File "/home/josteb/sandbox/keras/keras/backend/tensorflow_backend.py", line 2251, in __call__
        **self.session_kwargs)
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
        run_metadata_ptr)
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run
        feed_dict_string, options, run_metadata)
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
        target_list, options, run_metadata)
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape [-1] has negative dimensions
         [[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

    Caused by op 'dense_2_sample_weights', defined at:
      File "gradientlog.py", line 45, in <module>
        main()
      File "gradientlog.py", line 39, in main
        model.compile(loss='mse', optimizer='rmsprop')
      File "/home/josteb/sandbox/keras/keras/models.py", line 783, in compile
        **kwargs)
      File "/home/josteb/sandbox/keras/keras/engine/training.py", line 799, in compile
        name=name + '_sample_weights'))
      File "/home/josteb/sandbox/keras/keras/backend/tensorflow_backend.py", line 435, in placeholder
        x = tf.placeholder(dtype, shape=shape, name=name)
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1530, in placeholder
        return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1954, in _placeholder
        name=name)
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
        op_def=op_def)
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
        original_op=self._default_original_op, op_def=op_def)
      File "/home/josteb/.local/opt/anaconda3/envs/timeseries/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
        self._traceback = _extract_stack()

    InvalidArgumentError (see above traceback): Shape [-1] has negative dimensions
         [[Node: dense_2_sample_weights = Placeholder[dtype=DT_FLOAT, shape=[?], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
There are several placeholders related to the gradient computation process in Keras:
- Input x
- Target y
- Sample weights: even if you don't provide them in model.fit(), Keras still generates a placeholder for sample weights and feeds np.ones((y.shape[0],), dtype=K.floatx()) into the graph during training.
- Learning phase: this placeholder is connected to the gradient tensor only if there's a layer using it (e.g. Dropout).

So, in your provided example, in order to compute the gradients you need to feed x, y and sample_weights into the graph. The function in the question feeds only x, which leaves the target and sample-weight placeholders without values. That's the underlying reason for the error.
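The feed list described above can be sketched in plain NumPy. This is only an illustration: `build_feed` is a hypothetical helper, not part of Keras, and the `"float32"` default is an assumption that mirrors the usual value of K.floatx().

```python
import numpy as np


def build_feed(x, y, sample_weight=None, dtype="float32"):
    """Hypothetical helper: return [x, y, sample_weight] in the order
    inputs + targets + sample weights."""
    x = np.asarray(x, dtype=dtype)
    y = np.asarray(y, dtype=dtype)
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of samples")
    if sample_weight is None:
        # Uniform weights, one per sample, like the default described above.
        sample_weight = np.ones((y.shape[0],), dtype=dtype)
    else:
        sample_weight = np.asarray(sample_weight, dtype=dtype)
        if sample_weight.shape != (y.shape[0],):
            raise ValueError("sample_weight must have shape (n_samples,)")
    return [x, y, sample_weight]


x = np.random.random((128, 1))
y = 2 * x
feed = build_feed(x, y)
print([a.shape for a in feed])  # [(128, 1), (128, 1), (128,)]
```

A list built this way has one array per placeholder (input, target, sample weights). A model with a Dropout layer would additionally need a learning-phase value, which this sketch does not cover.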
Inside model._make_train_function(), the following lines show how to construct the necessary inputs for K.function() in your case:
    inputs = self._feed_inputs + self._feed_targets + self._feed_sample_weights
    if self.uses_learning_phase and not isinstance(K.learning_phase(), int):
        inputs += [K.learning_phase()]

    with K.name_scope('training'):
        ...
        self.train_function = K.function(inputs,
                                         [self.total_loss] + self.metrics_tensors,
                                         updates=updates,
                                         name='train_function',
                                         **self._function_kwargs)

By mimicking this function, you should be able to get the norm value:
    def get_gradient_norm_func(model):
        grads = K.gradients(model.total_loss, model.trainable_weights)
        summed_squares = [K.sum(K.square(g)) for g in grads]
        norm = K.sqrt(sum(summed_squares))
        # Feed inputs, targets and sample weights, as the train function does.
        inputs = model.model._feed_inputs + model.model._feed_targets + model.model._feed_sample_weights
        func = K.function(inputs, [norm])
        return func


    def main():
        x = np.random.random((128,)).reshape((-1, 1))
        y = 2 * x
        model = Sequential(layers=[Dense(2, input_shape=(1,)), Dense(1)])
        model.compile(loss='mse', optimizer='rmsprop')
        get_gradient = get_gradient_norm_func(model)
        history = model.fit(x, y, epochs=1)
        # One array per placeholder: x, y and uniform sample weights.
        print(get_gradient([x, y, np.ones(len(y))]))

Execution output:
    Epoch 1/1
    128/128 [==============================] - 0s - loss: 2.0073
    [4.4091368]

Note that since you're using Sequential instead of Model, model.model._feed_* is required instead of model._feed_*.
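The quantity returned by get_gradient_norm_func is the square root of the sum of squared gradient entries over all trainable weights. As a rough check of that formula only, the same reduction can be sketched in NumPy. `global_norm` is a hypothetical name, and the gradient values below are made up for illustration (they only mimic the kernel and bias shapes of the Dense(2) and Dense(1) layers, and are not taken from the model above).

```python
import numpy as np


def global_norm(grads):
    """Hypothetical helper: sqrt of the summed squares across a list of arrays."""
    summed_squares = [np.sum(np.square(g)) for g in grads]
    return float(np.sqrt(sum(summed_squares)))


# Made-up gradients shaped like the weights of Dense(2) followed by Dense(1).
grads = [
    np.array([[1.0, 2.0]]),    # first kernel, shape (1, 2)
    np.array([2.0, 0.0]),      # first bias, shape (2,)
    np.array([[4.0], [0.0]]),  # second kernel, shape (2, 1)
    np.array([0.0]),           # second bias, shape (1,)
]

print(global_norm(grads))  # 5.0

# Equivalent view: the L2 norm of all gradient entries flattened into one vector.
flat = np.concatenate([g.ravel() for g in grads])
print(np.isclose(global_norm(grads), np.linalg.norm(flat)))  # True
```

Summing per-tensor squared sums avoids building one large concatenated vector, which is presumably why the backend version is written that way.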