Old Exams Practice (topics may be different)
These questions are for study purposes only. They come from past quizzes, so the content and topics may not match this term's material exactly. We do not provide solutions, but we encourage you to come to office hours or post on Piazza if you have any questions.
Topic: Graphs
1-1. If the heuristic, h(n), used to estimate distance in A* underestimates the distance to the destination, A* will still always find an optimal answer.
1-2. If the heuristic, h(n), used to estimate distance in A* overestimates the distance to the destination, A* will still always find an optimal answer.
1-3. Dijkstra's algorithm is a variant of depth-first search.
1-4. In a tree, the depth-first search and breadth-first search algorithms both find the same path between a source and destination node.
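To build intuition for 1-1 and 1-2, here is a minimal A* sketch (not course code; the graph, nodes, and heuristics below are made up for illustration). With a heuristic that underestimates, A* returns the optimal path; a heuristic that overestimates can steer A* onto a costlier route.

```python
import heapq

def a_star(graph, source, target, h):
    # graph: dict mapping node -> list of (neighbor, edge_cost)
    # h: heuristic function, node -> estimated distance to target
    frontier = [(h(source), 0, source, [source])]  # (f = g + h, g, node, path)
    visited = set()
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == target:
            return g, path
        if node in visited:
            continue
        visited.add(node)
        for nbr, cost in graph.get(node, []):
            if nbr not in visited:
                heapq.heappush(frontier,
                               (g + cost + h(nbr), g + cost, nbr, path + [nbr]))
    return None

# Toy graph: optimal route S->A->G costs 2; the alternative S->B->G costs 4.
graph = {'S': [('A', 1), ('B', 1)], 'A': [('G', 1)], 'B': [('G', 3)]}

# Admissible heuristic (h = 0 everywhere underestimates): optimal path found.
print(a_star(graph, 'S', 'G', lambda n: 0))                  # (2, ['S', 'A', 'G'])

# Overestimating h at A makes A* expand B's route first and stop early.
print(a_star(graph, 'S', 'G', lambda n: 10 if n == 'A' else 0))  # (4, ['S', 'B', 'G'])
```

The second call shows why admissibility matters: the inflated estimate at A pushes the truly shorter path to the back of the priority queue.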
Which of the following are true?
Check all that apply.
2-1. This graph is a digraph.
2-2. This graph is a tree.
2-3. This graph is unweighted.
2-4. On this graph, breadth-first search visits fewer distinct nodes than depth-first search when finding the shortest path from node A to node D.
2-5. There is exactly one path between any pair of nodes.
Which of the following are true?
Check all that apply.
3-1. This graph has exactly 1 cycle.
3-2. Breadth-first search can be used to find a path between any pair of nodes in this graph.
4-1. A shortest path problem.
4-2. A minimum spanning tree problem.
4-3. A graph flow problem.
4-4. A graph partition problem.
def dfs_graph_search(graph, source, target, verbose=False):
    """Depth-first search for a shortest path from source to target.
    Assumes graph.children_of(node) and a path_to_string helper exist."""
    def dfs_internal(path, shortest):
        if verbose:
            print('Current DFS path:', path_to_string(path))
        last_node = path[-1]
        if last_node == target:
            if verbose:
                print('Path found')
            return path
        # Prune: extending this path cannot beat the best path found so far
        if shortest and len(path) + 1 >= len(shortest):
            return None
        best_path = None
        for next_node in graph.children_of(last_node):
            if next_node in path:  # avoid revisiting nodes (cycles)
                continue
            new_path = dfs_internal(path + [next_node], shortest)
            if new_path:
                if not best_path or len(new_path) < len(best_path):
                    best_path = new_path
                if not shortest or len(new_path) < len(shortest):
                    shortest = new_path
        return best_path
    return dfs_internal([source], None)
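To run the function above you need a graph object with a children_of method and a path_to_string helper, neither of which is shown. A minimal sketch of both (the Digraph class and the example edges are hypothetical, not from the exam):

```python
class Digraph:
    """Minimal directed graph: just enough to support children_of."""
    def __init__(self):
        self.edges = {}

    def add_edge(self, src, dest):
        self.edges.setdefault(src, []).append(dest)
        self.edges.setdefault(dest, [])

    def children_of(self, node):
        return self.edges[node]

def path_to_string(path):
    return '->'.join(path)

def dfs_graph_search(graph, source, target, verbose=False):
    def dfs_internal(path, shortest):
        if verbose:
            print('Current DFS path:', path_to_string(path))
        last_node = path[-1]
        if last_node == target:
            return path
        if shortest and len(path) + 1 >= len(shortest):
            return None
        best_path = None
        for next_node in graph.children_of(last_node):
            if next_node in path:
                continue
            new_path = dfs_internal(path + [next_node], shortest)
            if new_path:
                if not best_path or len(new_path) < len(best_path):
                    best_path = new_path
                if not shortest or len(new_path) < len(shortest):
                    shortest = new_path
        return best_path
    return dfs_internal([source], None)

g = Digraph()
for src, dest in [('A', 'B'), ('A', 'C'), ('B', 'C'), ('C', 'D')]:
    g.add_edge(src, dest)
print(dfs_graph_search(g, 'A', 'D'))  # ['A', 'C', 'D']
```

The search first finds A->B->C->D, then the pruning check lets the shorter A->C->D replace it.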
Topic: Regression and CLT
You are given the following functions.
import numpy as np
import random

def split_data(x_vals, y_vals, frac_training=0.7):
    """Randomly split paired data into training and test sets."""
    training_size = int(len(x_vals)*frac_training)
    training_indices = random.sample(range(len(x_vals)), training_size)
    training_x, training_y, test_x, test_y = [], [], [], []
    for i in range(len(x_vals)):
        if i in training_indices:
            training_x.append(x_vals[i])
            training_y.append(y_vals[i])
        else:
            test_x.append(x_vals[i])
            test_y.append(y_vals[i])
    return (training_x, training_y), (test_x, test_y)

def r_squared(observed, predicted):
    """Coefficient of determination; observed and predicted are numpy arrays."""
    error = ((predicted - observed)**2).sum()
    mean_error = error / len(observed)
    return 1 - mean_error / np.var(observed)
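A quick sketch of how r_squared behaves on a few edge cases (the arrays below are made-up data, included only to show that a perfect model scores 1, predicting the mean scores 0, and a model worse than the mean scores negative):

```python
import numpy as np

def r_squared(observed, predicted):
    error = ((predicted - observed)**2).sum()
    mean_error = error / len(observed)
    return 1 - mean_error / np.var(observed)

obs = np.array([1.0, 2.0, 3.0, 4.0])

print(r_squared(obs, obs))                       # 1.0: perfect predictions
print(r_squared(obs, np.full(4, obs.mean())))    # 0.0: predicting the mean
print(r_squared(obs, np.full(4, 10.0)))          # negative: worse than the mean
```

The last case is why a negative R^2 is possible on held-out data: nothing forces the model's error to stay below the variance of the observations.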
def find_best_fit(x, y, min_deg, max_deg):
    """ - x and y are numpy arrays of floats
        - x elements are in ascending order
        - min_deg and max_deg are ints > 0 with min_deg <= max_deg
    The values in y are known to be generated from the values in x by
    some process that can be modeled by a polynomial function of some
    degree between min_deg and max_deg, inclusive, plus noise.
    Return the degree of the polynomial that best models the underlying
    function without overfitting. """
    # your code here

# For example:
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1, 4.2, 7.9, 15.3, 27.5, 38, 51.2, 63.1, 88, 98.9]
print(find_best_fit(x, y, 1, 4))  # prints 2
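One possible approach, offered as a sketch rather than the official solution: score each candidate degree by leave-one-out cross-validation error, which penalizes overfitting deterministically (the course's split_data helper suggests a random train/test split instead; either idea works).

```python
import numpy as np

def find_best_fit(x, y, min_deg, max_deg):
    """Pick the polynomial degree with the lowest leave-one-out CV error."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best_deg, best_err = min_deg, float('inf')
    for deg in range(min_deg, max_deg + 1):
        err = 0.0
        for i in range(len(x)):
            mask = np.arange(len(x)) != i          # hold out point i
            coeffs = np.polyfit(x[mask], y[mask], deg)
            pred = np.polyval(coeffs, x[i])        # predict the held-out point
            err += (pred - y[i]) ** 2
        if err < best_err:
            best_deg, best_err = deg, err
    return best_deg
```

An overfit degree chases noise in the training points and predicts held-out points badly, so its cross-validation error grows even as its training R^2 improves.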
You are also given a function correlation, which returns a float representing the correlation between two lists. Write your code assuming it exists.
def highest_window_correlation(A, B, w):
    """ - A and B are lists of numbers
        - w is an int <= min(len(A), len(B))
    Returns the highest correlation between two contiguous windows of
    length w. One window is over A and the other over B. Round the
    highest correlation found to 3 digits.
    """
    # your code here

# For example:
A = [1, 2, 3, 4, 5, 6]
B = [3, 4, 2, 5, 7, 8]
print(highest_window_correlation(A, B, 3))  # prints 0.982
A = [1, 2, 3, 4, 5]
B = [2, 4, 6, 8, 10]
print(highest_window_correlation(A, B, 2))  # prints 1.0
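A sketch of one way to structure this: slide a window over each list and keep the best score. Since the exam's correlation helper is not shown, the stand-in below uses Pearson correlation via numpy.corrcoef; the exact value it reports may differ from the exam's expected output if the course helper used a different definition.

```python
import numpy as np

def correlation(xs, ys):
    # Hypothetical stand-in for the provided helper: Pearson correlation.
    return float(np.corrcoef(xs, ys)[0, 1])

def highest_window_correlation(A, B, w):
    best = float('-inf')
    # Compare every length-w window of A against every length-w window of B.
    for i in range(len(A) - w + 1):
        for j in range(len(B) - w + 1):
            best = max(best, correlation(A[i:i+w], B[j:j+w]))
    return round(best, 3)

print(highest_window_correlation([1, 2, 3, 4, 5], [2, 4, 6, 8, 10], 2))  # 1.0
```

The double loop is O((n - w)^2) window pairs; that is fine at exam scale, and the structure (enumerate windows, score, keep the max) is the part the question is testing.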
3-1: An R^2 value of 0.8 on A means that the model accounts for approximately 80% of the variance in A. 3-2: A negative R^2 value is possible if the model is tested on A. 3-3: Increasing d will never decrease the R^2 value of evaluating the model on A. 3-4: If A is only 60% of the data in some larger dataset, then increasing d will never decrease the R^2 value of evaluating the model on the remaining 40%. 3-5: If A is only 10% of the data in some larger dataset, then increasing d might increase or might decrease the R^2 value of evaluating the model on the remaining 90%. 3-6: A danger of overfitting is that the model captures experimental error in the data.
Assume X is a list of pseudo-random numbers and that k is a constant. Which of the following are true? Check all that apply.
4-1: If Y = [k**x for x in X], then numpy.polyfit(X, Y, 1) will fit a straight line to the values.
4-2: If Y = [k for x in X], then numpy.polyfit(X, Y, 2) will fit an approximately straight line to the values.
4-3: numpy.polyval returns a list of the coefficients of a linear regression model.
4-4: numpy.polyfit can be used to fit a function of more than one variable.
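A short experiment you can run to check your intuition for these (the data is made up; note in particular that polyval evaluates a polynomial at given points, while polyfit is what returns coefficients):

```python
import numpy as np

X = np.arange(10, dtype=float)

# Degree-2 fit to constant data: the quadratic and linear coefficients
# come out approximately zero, i.e. an approximately flat line (cf. 4-2).
coeffs = np.polyfit(X, np.full(10, 3.0), 2)
print(np.allclose(coeffs, [0.0, 0.0, 3.0]))  # True

# polyval EVALUATES a polynomial given its coefficients (cf. 4-3):
print(np.polyval([2.0, 1.0], 3.0))  # 2*3 + 1 = 7.0
```

Trying the same pattern with Y = [k**x for x in X] shows that polyfit with degree 1 always produces a line, regardless of how well it describes the data.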