Keras: What is model.inputs in VGG16
I started playing with Keras and VGG16 recently, and I am using keras.applications.vgg16.
But I have a question about what model.inputs is, because I saw others using it in https://github.com/keras-team/keras/blob/master/examples/conv_filter_visualization.py even though that script never initializes it:
...
input_img = model.input
...
layer_output = layer_dict[layer_name].output
if K.image_data_format() == 'channels_first':
    loss = K.mean(layer_output[:, filter_index, :, :])
else:
    loss = K.mean(layer_output[:, :, :, filter_index])
# we compute the gradient of the input picture wrt this loss
grads = K.gradients(loss, input_img)[0]
I checked the Keras site, but it only says that it is an input tensor with shape (1, 224, 224, 3). I still don't understand what that is exactly. Is it an image from ImageNet? Or a default image provided by Keras for the model?
I am sorry if I don't have enough understanding of deep learning, but can someone explain it to me please? Thanks.
python tensorflow keras
asked Nov 20 at 14:39, edited Nov 22 at 4:53 – cwl6
These are the dimensions of your image. A shape of (1,224,224,3) means you have 1 image (I might be wrong on this one), with both height and width of 224 pixels, and 3 channels (RGB).
– Xiaoyu Lu, Nov 20 at 21:26
1 Answer
The 4 dimensions of (1, 224, 224, 3) are the batch_size, image_width, image_height and image_channels respectively. (1, 224, 224, 3) means that the VGG16 model accepts a batch size of 1 (one image at a time) of shape 224x224 with three channels (RGB).
For more information on what a batch (and therefore a batch size) is, you can check this Cross Validated question.
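Note that model.input itself is a symbolic tensor, a placeholder with that shape, not an actual image: it is neither an ImageNet picture nor a default image shipped with Keras. As a minimal sketch of how to verify this, assuming a TF1-era Keras where tensors print this way (the exact text varies by version, and the batch dimension usually shows as ?/None because it is only fixed when you supply data):
from keras.applications.vgg16 import VGG16

model = VGG16(weights='imagenet')
print(model.input)        # e.g. Tensor("input_1:0", shape=(?, 224, 224, 3), dtype=float32)
print(model.input_shape)  # (None, 224, 224, 3): just an expected shape, no data attached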
Returning to VGG16, the input of the architecture is (1, 224, 224, 3). What does this mean? In order to feed an image into the network, you need to:
- Preprocess it to reach a shape of (224, 224) and 3 channels (RGB)
- Convert this to an actual array of shape (224, 224, 3)
- Group several images into a batch of the size the network requires (in this case the batch size is 1, but you still need to add the batch dimension to the array in order to obtain the shape (1, 224, 224, 3))
After doing this, you can input the image to the model.
Keras offers a few utility functions for these tasks. Below I present a modified version of the code snippet shown in "Extract features with VGG16" from the usage examples for image classification models in the documentation.
In order to actually run it, you need a jpg of any size named elephant.jpg. You can obtain it by running this bash command:
wget https://upload.wikimedia.org/wikipedia/commons/f/f9/Zoorashia_elephant.jpg -O elephant.jpg
I will split the code into image preprocessing and model prediction for clarity:
Load the image
import numpy as np
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input

img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))  # PIL image, resized to 224x224
x = image.img_to_array(img)                              # numpy array of shape (224, 224, 3)
x = np.expand_dims(x, axis=0)                            # add the batch dimension: (1, 224, 224, 3)
x = preprocess_input(x)                                  # ImageNet-specific preprocessing
You can add prints along the way to see what's going on, but here is a brief summary:
- image.load_img() loads a PIL image, already in RGB and already resized to (224, 224)
- image.img_to_array() translates this image into an array of shape (224, 224, 3). If you access x[0, 0, 0] you get the red component of the first pixel as a number between 0 and 255
- np.expand_dims(x, axis=0) adds the first (batch) dimension; x afterwards has shape (1, 224, 224, 3)
- preprocess_input does the extra preprocessing required for ImageNet-trained architectures. From its docstring (run help(preprocess_input)) you can see that it:
will convert the images from RGB to BGR, then will zero-center each color channel with respect to the ImageNet dataset, without scaling
This is the standard input preprocessing for the ImageNet training set.
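For instance, these are the kinds of prints you could add (a quick sanity check, assuming elephant.jpg was downloaded as above; the exact values depend on the image):
print(img.size)          # (224, 224): PIL reports (width, height)
print(x.shape, x.dtype)  # (1, 224, 224, 3) float32
print(x.min(), x.max())  # roughly in -124..152 after mean subtraction, no longer 0..255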
That's it for the preprocessing; now you can just feed the image to the pretrained model and get a prediction.
Predict
from keras.applications.vgg16 import VGG16

base_model = VGG16(weights='imagenet')  # pretrained VGG16, including the 1000-class top
y_hat = base_model.predict(x)
print(y_hat.shape)  # (1, 1000)
y_hat contains the probabilities the model assigned to each of the 1000 ImageNet classes for this image.
In order to obtain the class names and a readable output, Keras provides a utility function too:
from keras.applications.vgg16 import decode_predictions
decode_predictions(y_hat)
For the Zoorashia_elephant.jpg image downloaded before, this outputs:
[[('n02504013', 'Indian_elephant', 0.48041093),
('n02504458', 'African_elephant', 0.47474155),
('n01871265', 'tusker', 0.03912963),
('n02437312', 'Arabian_camel', 0.0038948185),
('n01704323', 'triceratops', 0.00062475674)]]
Which seems pretty good!
answered Nov 20 at 22:45, edited Nov 23 at 22:13 – Julian Peller
I am sorry I didn't make the question clear enough. I am looking at the code from github.com/keras-team/keras/blob/master/examples/…. It visualizes some filters, but it uses model.inputs to get a gradient: grads = K.gradients(loss, input_img)[0]. But it does not initialize the model input, so I am confused: is there any default input for the Keras VGG16 model?
– cwl6, Nov 22 at 3:41
Hi @cwl6. I think this is a pretty different question from the previous one, and my answer addressed the previous one extensively. You can always open a new question and the community will be glad to help! Regarding this question: you can see on line 91 that it's actually initializing a random-noise image with the line input_img_data = np.random.random((1, 3, img_width, img_height)), with shape (1, 224, 224, 3), which fits my explanation. This input is plugged into the model on line 98, using the definition of iterate done on line 84.
– Julian Peller, Nov 22 at 4:05
Thank you for your reply. I know it takes your time to answer my question. But on line 78, it used the model input to create a function. Anyway, I will open another new question. Thanks
– cwl6, Nov 22 at 4:20
input_img is a placeholder for the model.input. This is "plugged" into input_img_data using iterate on line 84, where iterate is a K.function which maps inputs (input_img_data) to outputs (the model gradients and losses). The connection between input_img_data and input_img happens there: by telling iterate to take input_img_data as input to obtain the gradients and losses of the model, it's implicitly saying to plug input_img_data into input_img and, in turn, into model.inputs. I'm not super confident with k
– Julian Peller, Nov 22 at 4:27
Now, I'm already researching :P ... I'm pretty noob with Keras too. The idea is the following: basically, you are just defining the structure of the model, with model.input as the input, until you execute a line saying "ok, this is the input". That line is line 98. This line (loss_value, grads_value = iterate([input_img_data])) says: compute loss and grads by running the model defined before with input_img_data as the input. iterate is the connection between both.
– Julian Peller, Nov 22 at 4:36
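To make the mechanism from these comments concrete, here is a minimal sketch of the placeholder-binding pattern, assuming channels_last data format and the TF1-era keras.backend API (K.gradients/K.function) that the example script uses; the layer name block3_conv1 and filter index 0 are arbitrary choices for illustration:
from keras.applications.vgg16 import VGG16
from keras import backend as K
import numpy as np

model = VGG16(weights='imagenet', include_top=False)
input_img = model.input                                # symbolic placeholder, holds no data
layer_output = model.get_layer('block3_conv1').output
loss = K.mean(layer_output[:, :, :, 0])                # mean activation of filter 0 (channels_last)
grads = K.gradients(loss, input_img)[0]                # symbolic gradient wrt the placeholder

# iterate maps concrete inputs to concrete outputs
iterate = K.function([input_img], [loss, grads])

# only here does real data get "plugged into" the placeholder
input_img_data = np.random.random((1, 224, 224, 3))
loss_value, grads_value = iterate([input_img_data])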