Here are the quiz answers for Convolutional Neural Networks, Course 4 of the 5 courses in Coursera’s Deep Learning Specialization.
Convolutional Neural Networks quiz answers to all weekly questions (weeks 1-4):
- Week 1 Foundations of Convolutional Neural Networks quiz answers
- Week 2 Deep Convolutional Models: Case Studies quiz answers
- Week 3 Object Detection quiz answers
- Week 4 Special Applications: Face recognition & Neural Style Transfer quiz answers
You may also be interested in Deep Learning Specialization quiz answers.
Coursera Deep Learning Specialization quiz answers
Neural Networks and Deep Learning Quiz Answers
Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization Quiz Answers
Structuring Machine Learning Projects Quiz Answers
Convolutional Neural Networks Quiz Answers
In the fourth course of the Deep Learning Specialization, you will understand how computer vision has evolved and become familiar with its exciting applications such as autonomous driving, face recognition, reading radiology images, and more.
By the end, you will be able to build a convolutional neural network, including recent variations such as residual networks; apply convolutional networks to visual detection and recognition tasks; and use neural style transfer to generate art and apply these algorithms to a variety of image, video, and other 2D or 3D data.
The Deep Learning Specialization is our foundational program that will help you understand the capabilities, challenges, and consequences of deep learning and prepare you to participate in the development of leading-edge AI technology. It provides a pathway for you to gain the knowledge and skills to apply machine learning to your work, level up your technical career, and take the definitive step in the world of AI.
Convolutional Neural Networks Week 1 Quiz Answers
Implement the foundational layers of CNNs (pooling, convolutions) and stack them properly in a deep network to solve multi-class image classification problems.
Week 1 Foundations of Convolutional Neural Networks quiz answers
Q1. What do you think applying this filter to a grayscale image will do?
- Detect 45-degree edges
- Detect horizontal edges
- Detect image contrast
- Detect vertical edges
Q2. Suppose your input is a 300 by 300 color (RGB) image, and you are not using a convolutional network. If the first hidden layer has 100 neurons, each one fully connected to the input, how many parameters does this hidden layer have (including the bias parameters)?
- 9,000,100
- 9,000,001
- 27,000,001
- 27,000,100
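The arithmetic behind Q2 can be checked with a short sketch. The helper name `dense_layer_params` is made up for illustration; it only assumes that a fully connected layer has one weight per input per neuron plus one bias per neuron.

```python
def dense_layer_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# A 300 x 300 RGB image flattened into a single input vector.
n_inputs = 300 * 300 * 3
print(dense_layer_params(n_inputs, 100))  # 27000100
```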
Q3. Suppose your input is a 300 by 300 color (RGB) image, and you use a convolutional layer with 100 filters that are each 5×5. How many parameters does this hidden layer have (including the bias parameters)?
- 7500
- 2600
- 7600
- 2501
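For Q3, a minimal sketch of the parameter count of a convolutional layer. The function name `conv_layer_params` is hypothetical; it assumes each filter spans all input channels and carries a single bias.

```python
def conv_layer_params(f: int, n_channels_in: int, n_filters: int) -> int:
    """Each filter has f * f * n_channels_in weights plus one bias."""
    return (f * f * n_channels_in + 1) * n_filters


# 100 filters of size 5x5 applied to an RGB (3-channel) input.
print(conv_layer_params(f=5, n_channels_in=3, n_filters=100))  # 7600
```

Note that the count does not depend on the 300 by 300 image size, which is the point of parameter sharing.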
Q4. You have an input volume that is 63x63x16, and convolve it with 32 filters that are each 7×7, using a stride of 2 and no padding. What is the output volume?
- 29x29x16
- 16x16x32
- 16x16x16
- 29x29x32
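The output size in Q4 follows the usual formula floor((n + 2p - f) / s) + 1 for height and width, with the number of filters as the output depth. A small sketch, with `conv_output_size` as an illustrative name:

```python
def conv_output_size(n: int, f: int, stride: int = 1, pad: int = 0) -> int:
    """Output height/width: floor((n + 2*pad - f) / stride) + 1."""
    return (n + 2 * pad - f) // stride + 1


side = conv_output_size(63, f=7, stride=2, pad=0)
n_filters = 32  # the output depth equals the number of filters
print((side, side, n_filters))  # (29, 29, 32)
```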
Q5. You have an input volume that is 15x15x8, and pad it using “pad=2.” What is the dimension of the resulting volume (after padding)?
- 19x19x12
- 19x19x8
- 17x17x10
- 17x17x8
Q6. You have an input volume that is 63x63x16, and convolve it with 32 filters that are each 7×7, and stride of 1. You want to use a “same” convolution. What is the padding?
- 3
- 7
- 2
- 1
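For Q6, a "same" convolution with stride 1 needs padding p = (f - 1) / 2 so that the output width equals the input width. The sketch below assumes an odd filter size; `same_padding` is an illustrative helper, not a library function.

```python
def same_padding(f: int) -> int:
    """Padding that keeps height/width unchanged for stride 1 (odd f only)."""
    if f % 2 == 0:
        raise ValueError("symmetric 'same' padding assumes an odd filter size")
    return (f - 1) // 2


p = same_padding(7)
out_side = 63 + 2 * p - 7 + 1  # stride-1 output size with this padding
print(p, out_side)  # 3 63
```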
Q7. You have an input volume that is 32x32x16, and apply max pooling with a stride of 2 and a filter size of 2. What is the output volume?
- 16x16x16
- 32x32x8
- 15x15x16
- 16x16x8
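Pooling in Q7 is applied to each channel independently, so only height and width shrink. A minimal sketch under that assumption (the function name is hypothetical):

```python
def max_pool_output_shape(h: int, w: int, c: int, f: int, stride: int):
    """Pooling acts per channel, so the channel count c is unchanged."""
    return ((h - f) // stride + 1, (w - f) // stride + 1, c)


print(max_pool_output_shape(32, 32, 16, f=2, stride=2))  # (16, 16, 16)
```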
Q8. Because pooling layers do not have parameters, they do not affect the backpropagation (derivatives) calculation.
- True
- False
Q9. In lecture we talked about “parameter sharing” as a benefit of using convolutional networks. Which of the following statements about parameter sharing in ConvNets are true? (Check all that apply.)
- It reduces the total number of parameters, thus reducing overfitting.
- It allows a feature detector to be used in multiple locations throughout the whole input image/input volume.
- It allows parameters learned for one task to be shared even for a different task (transfer learning).
- It allows gradient descent to set many of the parameters to zero, thus making the connections sparse.
Q10. In lecture we talked about “sparsity of connections” as a benefit of using convolutional layers. What does this mean?
- Each layer in a convolutional network is connected only to two other layers
- Regularization causes gradient descent to set many of the parameters to zero.
- Each activation in the next layer depends on only a small number of activations from the previous layer.
- Each filter is connected to every channel in the previous layer.
Convolutional Neural Networks Week 2 Quiz Answers
Discover some powerful practical tricks and methods used in deep CNNs, straight from the research papers, then apply transfer learning to your own deep CNN.
Week 2 Deep Convolutional Models: Case Studies quiz answers
Q1. Which of the following do you typically see in a ConvNet? (Check all that apply.)
- FC layers in the last few layers
- Multiple CONV layers followed by a POOL layer
- Multiple POOL layers followed by a CONV layer
- FC layers in the first few layers
Q2. In order to be able to build very deep networks, we usually only use pooling layers to downsize the height/width of the activation volumes while convolutions are used with “valid” padding. Otherwise, we would downsize the input of the model too quickly.
- True
- False
Q3. Training a deeper network (for example, adding additional layers to the network) allows the network to fit more complex functions and thus almost always results in lower training error. For this question, assume we’re referring to “plain” networks.
- True
- False
Q4. The following equation captures the computation in a ResNet block. What goes into the two blanks?
a[l+2] = g(W[l+2] g(W[l+1] a[l] + b[l+1]) + b[l+2] + _______ ) + _______
- 0 and z[l+1], respectively
- 0 and a[l], respectively
- z[l] and a[l], respectively
- a[l] and 0, respectively
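In the standard ResNet formulation, the skip connection adds a[l] to z[l+2] before the final activation is applied. The sketch below illustrates that with plain Python lists; the helper names and the toy weights are assumptions for illustration, and g is taken to be ReLU.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def affine(W, a, b):
    """Compute W a + b for a weight matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, a)) + bias for row, bias in zip(W, b)]


def residual_block(a_l, W1, b1, W2, b2):
    a_l1 = relu(affine(W1, a_l, b1))                 # a[l+1] = g(z[l+1])
    z_l2 = affine(W2, a_l1, b2)                      # z[l+2]
    return relu([z + s for z, s in zip(z_l2, a_l)])  # a[l+2] = g(z[l+2] + a[l])


# With W[l+2] = 0 and b[l+2] = 0 the block reduces to the identity
# for non-negative inputs.
a_l = [1.0, 2.0]
W1, b1 = [[0.5, -1.0], [2.0, 0.3]], [0.1, 0.0]
zero_W, zero_b = [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]
print(residual_block(a_l, W1, b1, zero_W, zero_b))  # [1.0, 2.0]
```

This is why a skip connection makes an identity mapping easy to learn: the block only has to drive the second layer's weights and bias toward zero.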
Q5. Which ones of the following statements on Residual Networks are true? (Check all that apply.)
- Using a skip-connection helps the gradient to backpropagate and thus helps you to train deeper networks
- The skip-connection makes it easy for the network to learn an identity mapping between the input and the output within the ResNet block.
- The skip-connections compute a complex non-linear function of the input to pass to a deeper layer in the network.
- A ResNet with L layers would have on the order of L^2 skip connections in total.
Q6. Suppose you have an input volume of dimension n_H x n_W x n_C. Which of the following statements do you agree with? (Assume that “1×1 convolutional layer” below always uses a stride of 1 and no padding.)
- You can use a 1×1 convolutional layer to reduce n_H, n_W, and n_C.
- You can use a 2D pooling layer to reduce n_H, n_W, but not n_C.
- You can use a 2D pooling layer to reduce n_H, n_W, and n_C.
- You can use a 1×1 convolutional layer to reduce n_C but not n_H, n_W.
Q7. Which ones of the following statements on Inception Networks are true? (Check all that apply.)
- Making an inception network deeper (by stacking more inception blocks together) might not hurt training set performance.
- A single inception block allows the network to use a combination of 1×1, 3×3, 5×5 convolutions and pooling.
- Inception networks incorporate a variety of network architectures (similar to dropout, which randomly chooses a network architecture on each step) and thus have a similar regularizing effect as dropout.
- Inception blocks usually use 1×1 convolutions to reduce the input data volume’s size before applying 3×3 and 5×5 convolutions.
Q8. Which of the following are common reasons for using open-source implementations of ConvNets (both the model and/or weights)? Check all that apply.
- The same techniques for winning computer vision competitions, such as using multiple crops at test time, are widely used in practical deployments (or production system deployments) of ConvNets.
- A model trained for one computer vision task can usually be used to perform data augmentation even for a different computer vision task.
- It is a convenient way to get working with an implementation of a complex ConvNet architecture.
- Parameters trained for one computer vision task are often useful as pretraining for other computer vision tasks.
Q9. In Depthwise Separable Convolution you:
- You convolve the input image with n_c number of n_f x n_f filters (n_c is the number of color channels of the input image).
- You convolve the input image with a filter of n_f x n_f x n_c where n_c acts as the depth of the filter (n_c is the number of color channels of the input image).
- Perform two steps of convolution.
- The final output is of the dimension n_out x n_out x n_c' (where n_c' is the number of filters used in the previous convolution step).
- Perform one step of convolution.
- For the “Depthwise” computations each filter convolves with all of the color channels of the input image.
- For the “Depthwise” computations each filter convolves with only one corresponding color channel of the input image.
- The final output is of the dimension n_out x n_out x n_c (where n_c is the number of color channels of the input image).
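To see why depthwise separable convolution is used, it helps to compare multiplication counts with a normal convolution. The sketch below is a rough cost model, not a measurement; the function names and the example sizes (3×3 filters, 3 input channels, 4×4 output, 5 output channels) are illustrative assumptions rather than values from the quiz.

```python
def normal_conv_cost(n_f: int, n_c: int, n_out: int, n_c_prime: int) -> int:
    """Multiplications: filter volume x output positions x number of filters."""
    return n_f * n_f * n_c * n_out * n_out * n_c_prime


def depthwise_separable_cost(n_f: int, n_c: int, n_out: int, n_c_prime: int) -> int:
    # Step 1 (depthwise): one n_f x n_f filter per input channel.
    depthwise = n_f * n_f * n_out * n_out * n_c
    # Step 2 (pointwise): n_c_prime filters of size 1 x 1 x n_c.
    pointwise = n_c * n_out * n_out * n_c_prime
    return depthwise + pointwise


print(normal_conv_cost(3, 3, 4, 5))          # 2160
print(depthwise_separable_cost(3, 3, 4, 5))  # 672
```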
Q10. Fill in the missing dimensions shown in the image below (marked W, Y, Z).
- W = 30, Y = 30, Z = 5
- W = 30, Y = 20, Z =20
- W = 5, Y = 20, Z = 5
- W = 5, Y = 30, Z = 20
Convolutional Neural Networks Week 3 Quiz Answers
Apply your new knowledge of CNNs to one of the hottest (and most challenging!) fields in computer vision: object detection.
Week 3 Object Detection quiz answers
Q1. You are building a 3-class object classification and localization algorithm. The classes are: pedestrian (c=1), car (c=2), motorcycle (c=3). What should y be for the image below? Remember that “?” means “don’t care”, which means that the neural network loss function won’t care what the neural network gives for that component of the output. Recall y = [pc, bx, by, bh, bw, c1, c2, c3]
- y = [1, ?, ?, ?, ?, ?, ?, ?]
- y = [1, ?, ?, ?, ?, 0, 0, 0]
- y = [0, ?, ?, ?, ?, ?, ?, ?]
- y = [0, ?, ?, ?, ?, 0, 0, 0]
- y = [?, ?, ?, ?, ?, ?, ?, ?]
Q2. You are working on a factory automation task. Your system will see a can of soft-drink coming down a conveyor belt, and you want it to take a picture and decide whether (i) there is a soft-drink can in the image, and if so (ii) its bounding box. Since the soft-drink can is round, the bounding box is always square, and the soft-drink can always appears as the same size in the image. There is at most one soft-drink can in each image. Here are some typical images in your training set:
What is the most appropriate set of output units for your neural network?
- Logistic unit, bx, by, bh (since bw = bh)
- Logistic unit, bx, by, bh, bw
- Logistic unit (for classifying if there is a soft-drink can in the image)
- Logistic unit, bx and by
Q3. If you build a neural network that inputs a picture of a person’s face and outputs N landmarks on the face (assume the input image always contains exactly one face), how many output units will the network have?
- N^2
- 2N
- N
- 3N
Q4. When training one of the object detection systems described in lecture, you need a training set that contains many pictures of the object(s) you wish to detect. However, bounding boxes do not need to be provided in the training set, since the algorithm can learn to detect the objects by itself.
- True
- False
Q5. What is the IoU between these two boxes? The upper-left box is 2×2, and the lower-right box is 2×3. The overlapping region is 1×1.
- None of the above
- 1/6
- 1/9
- 1/10
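The IoU in Q5 is the intersection area divided by the union area. The sketch below places the two boxes at hypothetical coordinates that reproduce the stated sizes (2×2, 2×3) and the 1×1 overlap; the coordinates themselves are an assumption, since the original figure is not available here.

```python
from fractions import Fraction


def iou(box_a, box_b) -> Fraction:
    """Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return Fraction(inter, union)


upper_left = (0, 0, 2, 2)   # 2 x 2 box, area 4
lower_right = (1, 1, 3, 4)  # 2 wide, 3 tall, area 6
print(iou(upper_left, lower_right))  # 1/9
```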
Q6. Suppose you run non-max suppression on the predicted boxes above. The parameters you use for non-max suppression are that boxes with probability ≤ 0.4 are discarded, and the IoU threshold for deciding if two boxes overlap is 0.5. How many boxes will remain after non-max suppression?
- 6
- 7
- 4
- 3
- 5
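The figure for Q6 is not reproduced here, so the answer cannot be recomputed, but the procedure can be sketched. The code below is a minimal greedy non-max suppression for a single class; the example detections are invented for illustration, and in a multi-class setting the same procedure would typically be run once per class.

```python
def iou(a, b) -> float:
    """Boxes are (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def non_max_suppression(detections, score_threshold=0.4, iou_threshold=0.5):
    """detections: list of (score, box). Returns the boxes that survive."""
    # Step 1: discard boxes whose probability is <= the score threshold.
    candidates = [d for d in detections if d[0] > score_threshold]
    # Step 2: visit boxes from highest to lowest score and keep a box
    # only if it does not overlap too much with any box already kept.
    candidates.sort(key=lambda d: d[0], reverse=True)
    kept = []
    for score, box in candidates:
        if all(iou(box, kept_box) < iou_threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept


# Hypothetical detections, not the boxes from the quiz figure.
detections = [
    (0.9, (0, 0, 10, 10)),
    (0.8, (1, 1, 11, 11)),    # overlaps the 0.9 box heavily, so suppressed
    (0.7, (20, 20, 30, 30)),
    (0.3, (40, 40, 50, 50)),  # below the score threshold, so discarded
]
print([score for score, _ in non_max_suppression(detections)])  # [0.9, 0.7]
```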
Q7. Suppose you are using YOLO on a 19×19 grid, on a detection problem with 20 classes, and with 5 anchor boxes. During training, for each image you will need to construct an output volume y as the target value for the neural network; this corresponds to the last layer of the neural network. (y may include some “?”, or “don’t cares”). What is the dimension of this output volume?
- 19x19x(5×20)
- 19x19x(20×25)
- 19x19x(5×25)
- 19x19x(25×20)
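For Q7, each anchor box in each grid cell carries pc, four bounding-box values, and one score per class. A small sketch of that bookkeeping, with `yolo_target_shape` as an illustrative name:

```python
def yolo_target_shape(grid: int, n_anchors: int, n_classes: int):
    per_anchor = 5 + n_classes  # pc, bx, by, bh, bw, then the class scores
    return (grid, grid, n_anchors * per_anchor)


print(yolo_target_shape(grid=19, n_anchors=5, n_classes=20))  # (19, 19, 125)
```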
Q8. What is Semantic Segmentation?
- Locating an object in an image belonging to a certain class by drawing a bounding box around it.
- Locating objects in an image by predicting each pixel as to which class it belongs to.
- Locating objects in an image belonging to different classes by drawing bounding boxes around them.
Q9. Using the concept of Transpose Convolution, fill in the values of X, Y and Z below.
(padding = 1, stride = 2)
Input: 2×2
1 | 2 |
3 | 4 |
Filter: 3×3
1 | 0 | -1 |
1 | 0 | -1 |
1 | 0 | -1 |
Result: 6×6
0 | 1 | 0 | -2 | ||
0 | X | 0 | Y | ||
0 | 1 | 0 | Z | ||
0 | 1 | 0 | -4 | ||
- X = 2, Y = -6, Z = -4
- X = -2, Y = -6, Z = -4
- X = 2, Y = 6, Z = 4
- X = 2, Y = -6, Z = 4
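The transpose convolution in Q9 can be worked through by hand or with the sketch below: each input value scales the whole filter, and the scaled copies are added into the output at offsets given by the stride. The function computes the full, unpadded output; how padding crops that output is handled afterwards. The observation that the quiz's displayed grid matches the full output with its first row and first column dropped comes from this arithmetic and is not stated in the source.

```python
def transpose_conv2d_full(x, kernel, stride):
    """Full (unpadded) transpose convolution of a square input."""
    n, f = len(x), len(kernel)
    size = (n - 1) * stride + f
    out = [[0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            for u in range(f):
                for v in range(f):
                    out[i * stride + u][j * stride + v] += x[i][j] * kernel[u][v]
    return out


x = [[1, 2],
     [3, 4]]
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]

full = transpose_conv2d_full(x, kernel, stride=2)  # 5 x 5
shown = [row[1:] for row in full[1:]]              # drop first row and column
for row in shown:
    print(row)
# [0, 1, 0, -2]
# [0, 2, 0, -6]
# [0, 1, 0, -4]
# [0, 1, 0, -4]
```

The known cells of the quiz grid (the first and last rows, and the first three columns) agree with this result, which gives X, Y and Z by reading off the second and third rows.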
Q10. Suppose your input to a U-Net architecture is h x w x 3, where 3 denotes your number of channels (RGB). What will be the dimension of your output?
- h x w x n, where n = number of output channels
- h x w x n, where n = number of filters used in the algorithm
- h x w x n, where n = number of output classes
- h x w x n, where n = number of input channels
Convolutional Neural Networks Week 4 Quiz Answers
Explore how CNNs can be applied to multiple fields, including art generation and face recognition, then implement your own algorithm to generate art and recognize faces!
Week 4 Special Applications: Face recognition & Neural Style Transfer quiz answers
Q1. Face verification requires comparing a new picture against one person’s face, whereas face recognition requires comparing a new picture against K persons’ faces.
- True
- False
Q2. Why do we learn a function d(img1, img2) for face verification? Select all that apply.
- Given how few images we have per person, we need to apply transfer learning.
- This allows us to learn to recognize a new person given just a single image of that person.
- We need to solve a one-shot learning problem.
- This allows us to learn to predict a person’s identity using a softmax output unit, where the number of classes equals the number of persons in the database plus 1 (for the final “not in database” class).
Q3. In order to train the parameters of a face recognition system, it would be reasonable to use a training set comprising 100,000 pictures of 100,000 different persons.
- False
- True
Q4. Which of the following is a correct definition of the triplet loss? Consider that α > 0. (We encourage you to figure out the answer from first principles, rather than just refer to the lecture.)
- max(||f(A) − f(P)||^2 − ||f(A) − f(N)||^2 − α, 0)
- max(||f(A) − f(N)||^2 − ||f(A) − f(P)||^2 + α, 0)
- max(||f(A) − f(P)||^2 − ||f(A) − f(N)||^2 + α, 0)
- max(||f(A) − f(N)||^2 − ||f(A) − f(P)||^2 − α, 0)
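Reasoning from first principles: the anchor should be closer to the positive than to the negative by at least the margin α, so the loss should be zero once ||f(A) − f(P)||^2 + α ≤ ||f(A) − f(N)||^2. A minimal sketch of that definition follows; the embeddings are made-up two-dimensional vectors.

```python
def squared_distance(u, v) -> float:
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(f_a, f_p, f_n, alpha: float) -> float:
    """Zero once the negative is farther than the positive by at least alpha."""
    return max(squared_distance(f_a, f_p) - squared_distance(f_a, f_n) + alpha, 0.0)


anchor, positive = [0.0, 0.0], [0.0, 1.0]
far_negative = [3.0, 4.0]   # squared distance 25: margin satisfied, loss is 0
near_negative = [1.0, 0.0]  # squared distance 1: margin violated, loss is alpha

print(triplet_loss(anchor, positive, far_negative, alpha=0.2))   # 0.0
print(triplet_loss(anchor, positive, near_negative, alpha=0.2))  # 0.2
```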
Q5. Consider the following Siamese network architecture:
The upper and lower neural networks have different input images, but have exactly the same parameters.
- True
- False
Q6. You train a ConvNet on a dataset with 100 different classes. You wonder if you can find a hidden unit which responds strongly to pictures of cats. (I.e., a neuron so that, of all the input/training images that strongly activate that neuron, the majority are cat pictures.) You are more likely to find this unit in layer 4 of the network than in layer 1.
- True
- False
Q7. Neural style transfer is trained as a supervised learning task in which the goal is to input two images (x), and train a network to output a new, synthesized image (y).
- True
- False
Q8. In the deeper layers of a ConvNet, each channel corresponds to a different feature detector. The style matrix G^[l] measures the degree to which the activations of different feature detectors in layer l vary (or correlate) together with each other.
- True
- False
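The style matrix in Q8 is a Gram matrix over channels: entry (k, k') sums the product of the activations of channels k and k' over all spatial positions. The sketch below uses a tiny made-up 2×2×2 activation volume indexed as [height][width][channel].

```python
def style_matrix(activations):
    """G[k][k2] = sum over all positions (i, j) of a[i][j][k] * a[i][j][k2]."""
    n_c = len(activations[0][0])
    G = [[0.0] * n_c for _ in range(n_c)]
    for row in activations:
        for pixel in row:
            for k in range(n_c):
                for k2 in range(n_c):
                    G[k][k2] += pixel[k] * pixel[k2]
    return G


# Hypothetical activations: 2 x 2 spatial positions, 2 channels.
activations = [
    [[1, 0], [2, 1]],
    [[0, 1], [1, 1]],
]
print(style_matrix(activations))  # [[6.0, 3.0], [3.0, 3.0]]
```

A large off-diagonal entry means the two channels tend to be active at the same positions, which is the sense in which the matrix captures how feature detectors correlate.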
Q9. In neural style transfer, what is updated in each iteration of the optimization algorithm?
- The pixel values of the content image C
- The pixel values of the generated image G
- The neural network parameters
- The regularization parameters
Q10. You are working with 3D data. You are building a network layer whose input volume has size 32x32x32x16 (this volume has 16 channels), and applies convolutions with 32 filters of dimension 3x3x3 (no padding, stride 1). What is the resulting output volume?
- Undefined: This convolution step is impossible and cannot be performed because the dimensions specified don’t match up.
- 30x30x30x32
- 30x30x30x16
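The same output-size formula used for 2D convolutions applies along each spatial axis of 3D data, and each 3x3x3 filter implicitly spans all 16 input channels. A short sketch, with `conv_output_shape` as an illustrative name:

```python
def conv_output_shape(spatial, f: int, n_filters: int, stride: int = 1, pad: int = 0):
    """Apply floor((n + 2*pad - f) / stride) + 1 to every spatial axis."""
    sizes = tuple((n + 2 * pad - f) // stride + 1 for n in spatial)
    return sizes + (n_filters,)


print(conv_output_shape((32, 32, 32), f=3, n_filters=32))  # (30, 30, 30, 32)
```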