The following built-in MATLAB functions and commands are permitted for this assignment.
The permitted functions are grouped into four categories: Vector/Matrix, Flow Control, Strings and Character Arrays, and Other.
Points will be deducted from any programs using functions outside this list.
Do not use AI for any part of this problem. Save each function as function_name.m, where each function_name is listed below.
reroll_all
Inputs:
hand (1x5) integer – current dice hand (1-6) sorted by rank.
counts (1x6) integer – die face counts.
Output:
sel (1x5) logical – indicates which dice to reroll (true = reroll, false = keep).
Details:
▸ Returns true for all hand locations.
▸ This policy serves as a baseline often used for comparison against smarter strategies (a minimal sketch follows).
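A minimal sketch consistent with this description; the two inputs are part of the common policy signature and are simply ignored here:
function sel = reroll_all(hand, counts)
    % Reroll every die regardless of the current hand or counts.
    sel = true(1, 5);
end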
sel = reroll_all([5 4 3 2 1], [1 1 1 1 1 0])
% Returns: sel =
% 1x5 logical array
% 1 1 1 1 1
reroll_none
Inputs:
hand (1x5) integer – current dice hand (1-6) sorted by rank.
counts (1x6) integer – die face counts.
Output:
sel (1x5) logical – indicates which dice to reroll (true = reroll, false = keep).
Details:
▸ Returns false for all hand locations.
▸ This is another baseline policy for comparison with better policies (a minimal sketch follows).
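A minimal sketch consistent with this description; again, the inputs are accepted but ignored:
function sel = reroll_none(hand, counts)
    % Keep every die regardless of the current hand or counts.
    sel = false(1, 5);
end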
sel = reroll_none([6 4 3 2 1], [1 1 1 1 0 1])
% Returns: sel =
% 1x5 logical array
% 0 0 0 0 0
reroll_base
This policy reproduces the dealer's reroll logic from your get_dealer_reroll helper.
Inputs:
hand (1x5) integer – current dice hand (1-6) sorted by rank.
counts (1x6) integer – die face counts.
Output:
sel (1x5) logical – indicates which dice to reroll (true = reroll, false = keep).
Details:
▸ If the hand is a straight, keep all dice.
▸ Else if all dice are singles, reroll only the die with face value 1.
▸ Else, reroll all singles (one possible sketch follows this list).
▸ If your code uses the rank ID, use your get_rank helper from Homework 3.
▸ Paste any helper functions used by this program underneath the function. I should be able to run it without any dependencies.
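One possible sketch of these rules; it detects a straight directly from counts instead of calling get_rank, which is just one way to do it:
function sel = reroll_base(hand, counts)
    % A straight uses each face of 1-5 or 2-6 exactly once.
    isStraight = all(counts(1:5) == 1) || all(counts(2:6) == 1);
    singles = counts(hand) == 1;     % dice whose face appears exactly once
    if isStraight
        sel = false(1, 5);           % keep a straight
    elseif all(singles)
        sel = hand == 1;             % all singles: reroll only the 1
    else
        sel = singles;               % otherwise reroll every single
    end
end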
sel = reroll_base([2 2 6 5 3], [0 2 1 0 1 1])
% sel =
% 1x5 logical array
% 0 0 1 1 1
reroll_greedy
Inputs:
hand (1x5) integer – current dice hand (1-6) sorted by rank.
counts (1x6) integer – die face counts.
Output:
sel (1x5) logical – indicates which dice to reroll (true = reroll, false = keep).
Details:
▸ This policy keeps only the dice with the highest multiplicity (the face you have the most of). In the case of two pair, it keeps only the higher pair (see the sketch below).
▸ Paste any helper functions used by this program underneath the function. I should be able to run it without any dependencies.
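A sketch of one way to implement this; breaking ties between equal multiplicities by keeping the higher face is an assumption here, but it covers the two-pair rule above:
function sel = reroll_greedy(hand, counts)
    bestCount = max(counts);                          % highest multiplicity in the hand
    keepFace = find(counts == bestCount, 1, 'last');  % highest face with that count
    sel = hand ~= keepFace;                           % reroll everything else
end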
sel = reroll_greedy([5 5 3 3 2], [0 1 2 0 2 0])
% sel =
% 1x5 logical array
% 0 0 1 1 1
reroll_singles
Inputs:
hand (1x5) integer – current dice hand (1-6) sorted by rank.
counts (1x6) integer – die face counts.
Output:
sel (1x5) logical – indicates which dice to reroll (true = reroll, false = keep).
Details:
▸ Rerolls every die whose face value appears exactly once in the hand (see the sketch below).
▸ Paste any helper functions used by this program underneath the function. I should be able to run it without any dependencies.
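A minimal sketch, assuming the behavior described above (reroll exactly the dice whose face count is one):
function sel = reroll_singles(hand, counts)
    sel = counts(hand) == 1;     % true wherever a die's face appears only once
end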
sel = reroll_singles([3 3 6 4 1], [1 0 2 1 0 1])
% sel =
% 1x5 logical array
% 0 0 1 1 1
reroll_str8_chaser
Inputs:
hand (1x5) integer – current dice hand (1-6) sorted by rank.
counts (1x6) integer – die face counts.
Output:
sel (1x5) logical – indicates which dice to reroll (true = reroll, false = keep).
Details:
▸ Determine how many dice are missing from a low straight (1-5) and from a high straight (2-6), and shoot for whichever version the hand is closer to (see the sketch after this list).
▸ If the two versions are tied, defer to the higher straight. If both versions are more than 3 dice away, fall back to reroll_greedy.
▸ For consistency, when there are multiples of a die face, reroll the dice to the right of the first one. See example 2 below.
▸ Paste any helper functions used by this program underneath the function. I should be able to run it without any dependencies.
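A sketch of one way to chase the closer straight; it keeps the first occurrence of each needed face and rerolls everything else, which also handles the duplicate rule above:
function sel = reroll_str8_chaser(hand, counts)
    missLow  = sum(counts(1:5) == 0);    % faces missing from the low straight (1-5)
    missHigh = sum(counts(2:6) == 0);    % faces missing from the high straight (2-6)
    if min(missLow, missHigh) > 3
        sel = reroll_greedy(hand, counts);   % too far from either straight
        return
    end
    if missHigh <= missLow
        target = 2:6;                    % ties defer to the higher straight
    else
        target = 1:5;
    end
    sel = true(1, 5);                    % start by rerolling everything...
    for f = target
        idx = find(hand == f, 1, 'first');
        if ~isempty(idx)
            sel(idx) = false;            % ...then keep the first die of each needed face
        end
    end
end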
sel = reroll_str8_chaser([6 6 5 4 1], [1 0 0 1 1 2])
% sel =
% 1x5 logical array
% 0 1 0 0 1
sel = reroll_str8_chaser([4 4 4 4 1], [1 0 0 4 0 0])
% sel =
% 1x5 logical array
% 0 1 1 1 0
apply_policy
Inputs:
hand (1x5) integer – current dice values (faces 1-6).
policy (function handle) – reroll selection policy.
Outputs:
hand (1x5) integer – final sorted hand after applying the policy and rerolling.
rid (1x1) integer – rank ID of the final hand.
Details:
▸ This mimics the dealer's steps after the initial roll and before you determine a winner in your play_one_round helper from Homework 4 (see the sketch below).
Helpers:
sort_by_rank, get_face_counts, get_rank, and the reroll policy passed in as policy.
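A sketch of the dealer's steps; the exact calls to the Homework 3/4 helpers (sort_by_rank, get_face_counts, get_rank) are assumptions about their signatures:
function [hand, rid] = apply_policy(hand, policy)
    hand = sort_by_rank(hand);              % the policy expects a rank-sorted hand
    counts = get_face_counts(hand);
    sel = policy(hand, counts);             % 1x5 logical: which dice to reroll
    hand(sel) = randi(6, 1, sum(sel));      % reroll the selected dice
    hand = sort_by_rank(hand);              % re-sort the final hand
    rid = get_rank(hand);
end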
[newHand, rid] = apply_policy([2 2 3 5 6], @reroll_none);
% Returns:
% newHand = 2 2 6 5 3   <-- Sorted
% rid = 7
[newHand, rid] = apply_policy([1 4 4 6 2], @reroll_greedy);
% Returns (values will vary):
% newHand = 4 4 4 4 1
% rid = 2
[newHand, rid] = apply_policy([1 2 3 5 6], @reroll_base);
% Returns (values will vary):
% newHand = 5 4 3 2 1
% rid = 3
mc_prob_dice_poker_rerolls
This works like your mc_prob_dice_poker_rolls function from Homework 5, except the results depend on the policy used to select which dice to reroll.
Inputs:
policy (function handle) – selection policy with signature \(\texttt{sel = policy(roll, counts)}\), returning a 1x5 logical vector indicating which dice to reroll. Default: @reroll_none (if policy is omitted).
N (1x1) integer – number of Monte-Carlo trials.
seed (1x1) integer – random number generator seed.
Outputs:
simEstimates (1x1) struct containing the fields fiveKind, fourKind, straight, fullhouse, threeKind, twoPair, onePair, and singles, where each confidence interval is formatted as "\(\hat{p}\%\) ± \(ME\%\)".
Details:
▸ For each Monte-Carlo loop iteration: roll a random hand, apply the policy with apply_policy, and tally the resulting rank ID (see the sketch after this list).
▸ Hard-code the confidence levels to 95% and use your pHat_marginErr_w_CL to find the margin of error for each estimate.
▸ Results are formatted as percentages with one decimal place for pHat and two decimals for ME. Use round to set the decimals.
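A sketch of the Monte-Carlo loop and the string formatting; the default trial count and the exact signature of pHat_marginErr_w_CL are assumptions:
function simEstimates = mc_prob_dice_poker_rerolls(policy, N, seed)
    if nargin < 1, policy = @reroll_none; end
    if nargin < 2, N = 1e5; end              % assumed default number of trials
    if nargin >= 3, rng(seed); end
    tally = zeros(1, 8);                     % one bin per rank ID
    for k = 1:N
        [~, rid] = apply_policy(randi(6, 1, 5), policy);
        tally(rid) = tally(rid) + 1;
    end
    pHat = tally / N;
    names = ["fiveKind" "fourKind" "straight" "fullhouse" ...
             "threeKind" "twoPair" "onePair" "singles"];
    for r = 1:8
        ME = pHat_marginErr_w_CL(pHat(r), N, 0.95);   % assumed signature, 95% CL
        simEstimates.(names(r)) = round(100*pHat(r), 1) + "% ± " + ...
                                  round(100*ME, 2) + "%";
    end
end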
simEstimates = mc_prob_dice_poker_rerolls();
% Returns (values will vary):
% struct with fields:
%
% fiveKind: "0.1% Β± 0.02%"
% fourKind: "1.9% Β± 0.08%"
% straight: "3% Β± 0.11%"
% fullhouse: "3.8% Β± 0.12%"
% threeKind: "15.5% Β± 0.22%"
% twoPair: "23% Β± 0.26%"
% onePair: "46.4% Β± 0.31%"
% singles: "6.3% Β± 0.15%"
simEstimates = mc_prob_dice_poker_rerolls(@reroll_greedy, 2e5, 123);
% Returns:
% struct with fields:
%
% fiveKind: "1.1% Β± 0.05%"
% fourKind: "10.1% Β± 0.13%"
% straight: "3% Β± 0.08%"
% fullhouse: "14.7% Β± 0.16%"
% threeKind: "23.7% Β± 0.19%"
% twoPair: "28.4% Β± 0.2%"
% onePair: "12.8% Β± 0.15%"
% singles: "6.2% Β± 0.11%"
mc_prob_policy_wins
Estimates the probability that one policy (policyA) wins against another (policyB) in simulated 5-dice poker matches. Each trial generates random starting hands, applies each policy once, compares the resulting hands, and tracks the wins.
Inputs:
policyA (function handle) – selection policy for player A.
policyB (function handle) – selection policy for player B.
N (1x1) integer – number of Monte-Carlo trials.
seed (1x1) integer – optional random number seed.
Outputs:
pHat (1x1) double – estimated probability that policyA wins.
ME (1x1) double – margin of error for pHat at the 95% confidence level.
Details:
▸ For each Monte-Carlo loop iteration: roll a starting hand for each player and apply that player's policy.
▸ Count total wins.
▸ Compute the sample win probability for policyA.
▸ Use 95% confidence levels for the margin of error (a sketch follows the Helpers list).
Helpers:
apply_policy, get_winner, pHat_marginErr_w_CL, and the reroll policy functions.
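A sketch of the loop; the get_winner call (assumed here to return 1 when the first hand wins) and the pHat_marginErr_w_CL signature are assumptions about your existing helpers:
function [pHat, ME] = mc_prob_policy_wins(policyA, policyB, N, seed)
    if nargin < 3, N = 1e5; end          % assumed default number of trials
    if nargin >= 4, rng(seed); end       % seed is optional
    wins = 0;
    for k = 1:N
        handA = apply_policy(randi(6, 1, 5), policyA);
        handB = apply_policy(randi(6, 1, 5), policyB);
        if get_winner(handA, handB) == 1
            wins = wins + 1;             % tally wins for policyA
        end
    end
    pHat = wins / N;                             % sample win probability
    ME = pHat_marginErr_w_CL(pHat, N, 0.95);     % 95% margin of error
end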
[pHat, ME] = mc_prob_policy_wins(@reroll_none, @reroll_greedy, 2e5, 314)
% Returns:
% pHat = 0.3024
% ME = 0.0020
[pHat, ME] = mc_prob_policy_wins(@reroll_base, @reroll_str8_chaser)
% Returns (values will vary):
% pHat = 0.7389
% ME = 0.0027
head2head_results
Inputs:
None.
Outputs:
probs (M x M) double matrix – head-to-head win probabilities in percent, where probs(A,B) gives the estimated chance that policy A wins against policy B. Each value is rounded to one decimal place (e.g., \(64.7\) represents a 64.7% win rate).
Details:
▸ Define a structure of six standard policies, mapping names to their function handles:
"all" → @reroll_all, "none" → @reroll_none, "base" → @reroll_base, "singles" → @reroll_singles, "greedy" → @reroll_greedy, "str8" → @reroll_str8_chaser
▸ To create the probability matrix, use a nested for loop (a sketch follows the Helpers list).
▸ Both loops should scan through the string array of the fields in the policy structure above. You can hard-code this string array or extract it from the structure with policyNames = string(fields(policies)).
▸ Initialize an M x M matrix of zeros to store the estimated probabilities.
▸ Convert the probabilities to a percent and round to one decimal place using the round(*,1) command.
▸ The diagonal entries represent self-comparison (policy vs. itself), which should be approximately 50%.
Helpers:
mc_prob_policy_wins, apply_policy, get_winner, get_face_counts, sort_by_rank, get_rank, and all reroll policy functions.
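A sketch of the nested loop described above; trial counts and seeds are left at the mc_prob_policy_wins defaults here:
function probs = head2head_results()
    policies = struct('all', @reroll_all, 'none', @reroll_none, ...
                      'base', @reroll_base, 'singles', @reroll_singles, ...
                      'greedy', @reroll_greedy, 'str8', @reroll_str8_chaser);
    policyNames = string(fields(policies));
    M = numel(policyNames);
    probs = zeros(M, M);                        % M x M win-probability matrix
    for A = 1:M
        for B = 1:M
            pHat = mc_prob_policy_wins(policies.(policyNames(A)), ...
                                       policies.(policyNames(B)));
            probs(A, B) = round(100*pHat, 1);   % percent, one decimal place
        end
    end
end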
probs = head2head_results()
policyNames = ["all" "none" "base" "singles" "greedy" "str8"];
T = array2table(probs, 'VariableNames', policyNames, 'RowNames', policyNames);
disp(T)
avgWinRate = mean(probs, 2);
[~, idxBest] = max(avgWinRate);
bestPolicy = policyNames(idxBest)
% Returns the policy with the highest mean win rate across all opponents.