5 years ago · 7f2b461110
--- a/players/08-mimic-the-dealer/run.sh
+++ b/players/08-mimic-the-dealer/run.sh
@@ -1,4 +1,4 @@
 rm -f d2p p2d
 mkfifo d2p p2d
 gawk -f mimic-the-dealer.awk < d2p > p2d &
 ../../blackjack -n100000 > d2p < p2d 
 blackjack -n1e5 > d2p < p2d 
--- a/players/20-basic-strategy/README.m4
+++ b/players/20-basic-strategy/README.m4
@@ -0,0 +1,133 @@
 define(case_title, Derivation of the basic strategy)
 ---
 title: case_title
 ...

 # case_title

 > Difficulty: case_difficulty/100

 ## Quick run

 Execute the `run.sh` script. It should take a few minutes:

 ```terminal
 $ ./run.sh
 h20-2 (10 10)   8.0e+04 +63.23  (1.1)   -171.17 (1.1)   -85.32  (0.5)   stand
 h20-3 (10 10)   8.0e+04 +64.54  (1.1)   -171.50 (1.1)   -85.50  (0.5)   stand
 h20-4 (10 10)   8.0e+04 +65.55  (1.1)   -170.33 (1.1)   -85.50  (0.5)   stand
 h20-5 (10 10)   8.0e+04 +66.65  (1.1)   -171.25 (1.1)   -85.51  (0.5)   stand
 h20-6 (10 10)   8.0e+04 +67.80  (1.1)   -171.07 (1.1)   -85.59  (0.5)   stand
 h20-7 (10 10)   8.0e+04 +77.44  (1.1)   -170.53 (1.1)   -85.44  (0.5)   stand
 h20-8 (10 10)   8.0e+04 +79.11  (1.1)   -170.08 (1.1)   -85.02  (0.6)   stand
 h20-9 (10 10)   8.0e+04 +75.77  (1.1)   -170.31 (1.1)   -84.87  (0.6)   stand
 [...]
 p2-6            8e+04   +24.78  (2.9)   +3.07   (1.0)   yes
 p2-7            8e+04   +1.48   (2.0)   -8.90   (1.0)   yes
 p2-8            8e+04   -17.57  (2.0)   -16.33  (1.0)   uncertain
 p2-8            3e+05   -17.88  (1.0)   -16.10  (0.5)   no
 p2-9            8e+04   -38.73  (2.0)   -24.38  (1.0)   no
 p2-T            8e+04   -54.45  (1.8)   -34.92  (0.9)   no
 p2-A            8e+04   -67.11  (1.5)   -51.59  (0.9)   no
 ```

 A new text file called `bs.txt` with the strategy should be created from scratch:

 ```
 include(bs.txt)dnl
 ```

 ## Full table with results

 The script computes the expected value of each combination of:

 1. Player’s hand (hard, soft and pair)
 2. Dealer upcard
 3. Hit, double or stand (for hard and soft hands) and splitting or not (for pairs)
 
 The results are given as the expected value in percentage with the uncertainty (one standard deviation) in the last significant digit.
 
 define(table_head,
 <thead>
  <tr>
   <th class="text-center" width="10%" colspan="2">Hand</th>
   <th class="text-center" width="9%">2</th>
   <th class="text-center" width="9%">3</th>
   <th class="text-center" width="9%">4</th>
   <th class="text-center" width="9%">5</th>
   <th class="text-center" width="9%">6</th>
   <th class="text-center" width="9%">7</th>
   <th class="text-center" width="9%">8</th>
   <th class="text-center" width="9%">9</th>
   <th class="text-center" width="9%">T</th>
   <th class="text-center" width="9%">A</th>
  </tr>
 </thead> 
 )
 
 ```{=html}
 <table class="table table-sm table-responsive table-hover small w-100">
 table_head
 <tbody> 
 include(pair.html)
 </tbody>
 table_head
 <tbody> 
 include(soft.html)
 </tbody>
 table_head
 <tbody> 
 include(hard.html)
 </tbody>
 </table>
 ```

 include(table.md)

 ## Detailed explanation

 We want to derive the basic strategy from scratch, i.e. without assuming anything about the correct play. To do so, we play a large number of hands (more on what _large_ means below) with our first two cards and the dealer upcard fixed, and in turn stand, double or hit on the first decision. We then compare the results of the three plays and take the best one as the proper strategy.

 Standing and doubling are easy plays because, once we stand or double down, the dealer plays according to the rules: she hits until seventeen, possibly hitting soft seventeen. But if we hit, we might need to make another decision on whether to stand or hit again. As we do not want to assume anything, we have to analyze the hands in such an order that, whenever another decision is needed, we already know which one is best.

 ### Hard hands

 We start by arranging the shoe so that the player gets a hard twenty (i.e. two ten-valued cards) and the dealer successively gets each upcard from two to ace. We play each of the ten dealer upcards three times, either

 1. always standing
 2. always doubling
 3. always hitting
 
 In general the first two plays are easy, because the game stops either after standing or after receiving only one card. The last one might lead to further hitting, but since we are starting with a hard twenty, hitting gives the player either twenty-one or a bust. In either case, the game also ends.
 We play a certain number of hands (say one thousand) with each of these three plays for each of the ten upcards and record the outcome. The correct play for hard twenty against each upcard is the one that gave the best result, which is of course standing.

 Next, we do the same for a hard nineteen. In this case, the hitting play might not end after one card is drawn (i.e. we hit on nineteen and get an ace). But if that is the case, we already know the best play from the previous step, so we play accordingly and stand. Repeating this procedure down to hard four, we can build the basic strategy table for any hard total against any dealer upcard.
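
As a minimal sketch of why this top-down order works, the hypothetical helper `hit_hard` below (it is not part of `run.sh`) computes the hand that results from hitting a hard total. Card values go from 1 to 10, with the ace passed as 1; both conventions are assumptions made for this illustration.

```bash
# Hypothetical helper, not taken from run.sh.
# Usage: hit_hard <hard total> <card value 1..10, ace = 1>
hit_hard() {
  local total=$1 card=$2 sum
  # An ace counts as eleven when that does not bust the hand (soft hand)
  if [ "${card}" -eq 1 ] && [ $(( total + 11 )) -le 21 ]; then
    echo "s$(( total + 11 ))"
    return
  fi
  sum=$(( total + card ))
  if [ "${sum}" -gt 21 ]; then
    echo "bust"
  else
    echo "h${sum}"
  fi
}

hit_hard 19 1    # prints h20, a total analyzed in the previous step
hit_hard 19 5    # prints bust
```

For totals of eleven or more the result is always a higher hard total or a bust. For totals of ten or less, an ace turns the hand into a soft one, which the sketch reports with an `s` prefix.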

 ### Soft hands

 We can now switch to analyzing soft hands. Starting from soft twenty (i.e. an ace and a nine), we do the same as for the hard case. The only difference is that when hitting we might end up either in another soft hand, which we have already analyzed because we start from twenty and go down, or in a hard hand, which we have also already analyzed, so we can play accordingly.
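
The two possible outcomes of hitting a soft hand can be sketched with a hypothetical helper `hit_soft` (again, not part of `run.sh`). It assumes card values from 1 to 10 with the ace passed as 1.

```bash
# Hypothetical helper, not taken from run.sh.
# Usage: hit_soft <soft total 12..20> <card value 1..10, ace = 1>
hit_soft() {
  local sum=$(( $1 + $2 ))
  if [ "${sum}" -le 21 ]; then
    # the first ace still counts as eleven: a higher soft total
    echo "s${sum}"
  else
    # the ace drops from eleven to one: the hand becomes hard
    echo "h$(( sum - 10 ))"
  fi
}

hit_soft 13 4     # prints s17
hit_soft 18 5     # prints h13
```

A soft hand never busts on a single hit, so every outcome is either a higher soft total or a hard total, both of which are already in the strategy file by the time they are needed.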

 ### Pairs

 When dealing with pairs, we have to decide whether to split or not. When we do not split, we end up in one of the already-analyzed cases: either a soft twelve or an even hard hand. When we split, we might end up in a hard or soft hand (already analyzed) or in a new pair. But since the new pair can only be the same pair we are analyzing, we treat it like the first pair, either splitting it or not, so we know how to deal with it.
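
The non-split case can be sketched as a mapping from the pair to the equivalent hard or soft hand. The helper name `unsplit_hand` and the `h`/`s` prefixes are assumptions for illustration; the card labels `A`, `T` and `2` to `9` follow the ones used in the pair loop of `run.sh`.

```bash
# Hypothetical helper, not taken from run.sh.
# Usage: unsplit_hand <pair card: A, T or 2..9>
unsplit_hand() {
  case "$1" in
    A)     echo "s12" ;;               # two aces play as a soft twelve
    T)     echo "h20" ;;               # two ten-valued cards are a hard twenty
    [2-9]) echo "h$(( 2 * $1 ))" ;;    # any other pair is an even hard total
    *)     echo "unknown pair: $1" >&2; return 1 ;;
  esac
}

unsplit_hand 8    # prints h16
unsplit_hand A    # prints s12
```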

 ### Number of hands

 The output is the expected value\ $e$ of the bankroll, which is a random variable with an associated uncertainty\ $\Delta e$ (i.e. a certain number of standard deviations). For example, if we received only blackjacks, the expected value would be 1.5 (provided blackjacks pay\ 3 to\ 2, of course). If we busted all of our hands without doubling or splitting, the expected value would be -1. In order to say that the best strategy is, let’s say, to stand rather than to hit or double down, we have to make sure that $e_s-\Delta e_s > e_h+\Delta e_h$ and $e_s-\Delta e_s > e_d+\Delta e_d$. If no play gives a better expected value than the other two once the uncertainties are taken into account, then we have to play more hands in order to reduce the random uncertainty.
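
The decision rule can be sketched as follows: a play is taken as the best one only when its lower bound exceeds the upper bound of each alternative. The function names `is_better` and `best_play` are hypothetical; the comparison mirrors the `awk` expression `($1-$2) > ($3+$4)` used in `run.sh`, but this is a simplified illustration and not the script itself.

```bash
# Hypothetical helpers, not taken from run.sh.
# is_better <ev_a> <err_a> <ev_b> <err_b>
# succeeds when ev_a - err_a > ev_b + err_b
is_better() {
  awk -v ea="$1" -v da="$2" -v eb="$3" -v db="$4" \
    'BEGIN { exit !((ea - da) > (eb + db)) }'
}

# best_play <ev_s> <err_s> <ev_d> <err_d> <ev_h> <err_h>
best_play() {
  local ev_s=$1 err_s=$2 ev_d=$3 err_d=$4 ev_h=$5 err_h=$6
  if is_better "${ev_s}" "${err_s}" "${ev_d}" "${err_d}" &&
     is_better "${ev_s}" "${err_s}" "${ev_h}" "${err_h}"; then
    echo "stand"
  elif is_better "${ev_d}" "${err_d}" "${ev_s}" "${err_s}" &&
       is_better "${ev_d}" "${err_d}" "${ev_h}" "${err_h}"; then
    echo "double"
  elif is_better "${ev_h}" "${err_h}" "${ev_s}" "${err_s}" &&
       is_better "${ev_h}" "${err_h}" "${ev_d}" "${err_d}"; then
    echo "hit"
  else
    echo "uncertain"   # no clear winner: more hands are needed
  fi
}

# values of the h20-2 line of the sample output (stand, double, hit)
best_play +63.23 1.1 -171.17 1.1 -85.32 0.5    # prints stand
```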


 ## Implementation

 The steps above can be written in a [Bash](https://en.wikipedia.org/wiki/Bash_%28Unix_shell%29) script that

 * loops over hands and upcards,
 * creates a strategy file for each possible play: hit, double or stand (or split or not),
 * runs [Libre Blackjack](https://www.seamplex.com/blackjack),
 * checks the results and picks the best play,
 * updates the strategy file

 ```bash
 include(run.sh)
 ```
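
As a rough sketch of how the number of hands grows while a combination remains uncertain, the hypothetical function `hands_schedule` below lists the run sizes for the defaults `n0=80000` and `n_max=9000000`, multiplying by four at each step as `run.sh` does. The labels `with-errors` and `errors-ignored` are made up for this illustration: they mark whether a run is at or below `n_max` (uncertainties taken into account) or above it (uncertainties assumed to be zero, so a decision is forced).

```bash
# Hypothetical helper, not taken from run.sh.
# Usage: hands_schedule [n0] [n_max]
hands_schedule() {
  local n=${1:-80000} n_max=${2:-9000000}
  while [ "${n}" -le "${n_max}" ]; do
    echo "${n} with-errors"
    n=$(( n * 4 ))
  done
  # the first size above n_max is still played, but errors are ignored
  echo "${n} errors-ignored"
}

hands_schedule
```

Since the statistical uncertainty scales roughly as $1/\sqrt{n}$, each step should halve it, which is consistent with the two `p2-8` lines of the sample output, where the uncertainties go from 2.0 and 1.0 to 1.0 and 0.5.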

 case_nav
--- a/players/20-basic-strategy/html_cell.awk
+++ b/players/20-basic-strategy/html_cell.awk
@@ -0,0 +1,35 @@
 #function abs(v) {return v < 0 ? -v : v}
 function ceil(x, y){y=int(x); return(x>y?y+1:y)}
 {
 ev=1e-2*$1
 error=1e-2*$2

 if (ev < -1)
   x=-1
 else if (ev > 1)
   x=1
 else
   x=ev
   
 #  r=0.5-0.5*x
 #  g=0.5+0.5*x
 #  b=1-abs(x)

 pi = atan2(0, -1)
 r=cos((x+1)*pi/4)
 g=cos((x-1)*pi/4)
 b=0.4+0.2*cos(x*pi/2)

 
 if (error < 1e-6) {
   error=1e-4;
 }
 
 precision = (ceil(-log(error)/log(10)))-2;
 
 printf("<div class=\"text-center %s\" style='background-color: rgb(%d,%d,%d)'>", (ev<0)?"text-white":"", 255*r, 255*g, 255*b);
 printf(sprintf("%%+.%df", precision), 100*ev);
 printf("(%.0g)", 10^(precision+2) * error);
 printf("</div>");
 
 }
--- a/players/20-basic-strategy/options.conf
+++ b/players/20-basic-strategy/options.conf
@@ -1,5 +1,6 @@
 decks = -1
 decks = 0         # infinite decks
 flat_bet = 1
 no_insurance = true
 &error_standard_deviations = 2
 ; hit_soft_17 = 0
 ; rng_seed = 1
--- a/players/20-basic-strategy/run.sh
+++ b/players/20-basic-strategy/run.sh
@@ -1,8 +1,18 @@
 #!/bin/bash

 n_max=9999999
   n0=80000
 n_max=9000000

 for i in grep awk; do
 RED="\033[0;31m"
 GREEN="\033[0;32m"

 BROWN="\033[0;33m"
 MAGENTA="\e[0;35m"
 CYAN="\e[0;36m"

 NC="\033[0m" # No Color

 for i in grep awk printf blackjack; do
 if [ -z "$(which $i)" ]; then
  echo "error: $i not installed"
  exit 1
@@ -18,16 +28,16 @@ declare -A min
 min["hard"]=4   # from 20 to 4 in hards
 min["soft"]=12  # from 20 to 12 in softs

 rm -f hard.html soft.html pair.html
 rm -f table.md hard.html soft.html pair.html

 # --------------------------------------------------------------
 # start with standing
 cp hard-stand.txt hard.txt
 cp soft-stand.txt soft.txt

 cat << EOF > table.md
 | Hand | \$n\$ | Stand | Double | Hit |
 | ---- | ----- | ----- | ------ | --- |
 cat << EOF >> table.md
 |  Hand  |  \$n\$  |  Stand [%]  |  Double [%]  |  Hit [%] |   Play    |
 |:------:|:-----:|:-----------:|:------------:|:--------:|:---------:|
 EOF


@@ -70,12 +80,12 @@ EOF
     upcard_n=$(($upcard))
   fi
 
   n=10000   # start with n hands
   n=${n0}   # start with n0 hands
   best="x"  # x means we don't know what to do yet, so play
   
   while [ "${best}" = "x" ]; do
    # tell the user which combination we are trying and how many we will play
    echo -ne "${t}${hand}-${upcard} ($card1 $card2)\t"$(printf %.0e ${n})
    echo -ne "${t}${hand}-${upcard} ($card1 $card2)\t"$(printf %.1e ${n})
    for play in s d h; do
     
     # start with options.conf as a template and add some custom stuff
@@ -147,9 +157,9 @@ EOF
    
    if [ ${n} -le ${n_max} ]; then 
     # if we still have room, take into account errors
     error_s=$(echo ${error[${t}${hand},${upcard},s]} | awk '{printf("%+.1f", 100*$1)}')
     error_d=$(echo ${error[${t}${hand},${upcard},d]} | awk '{printf("%+.1f", 100*$1)}')
     error_h=$(echo ${error[${t}${hand},${upcard},h]} | awk '{printf("%+.1f", 100*$1)}')
     error_s=$(echo ${error[${t}${hand},${upcard},s]} | awk '{printf("%.1f", 100*$1)}')
     error_d=$(echo ${error[${t}${hand},${upcard},d]} | awk '{printf("%.1f", 100*$1)}')
     error_h=$(echo ${error[${t}${hand},${upcard},h]} | awk '{printf("%.1f", 100*$1)}')
    else
     # instead of running infinite hands, above a threshold assume errors are zero
     error_s=0
@@ -163,35 +173,51 @@ EOF
   
    if   (( $(echo ${ev_s} ${error_s} ${ev_d} ${error_d} | awk '{print (($1-$2) > ($3+$4))}') )) &&
         (( $(echo ${ev_s} ${error_s} ${ev_h} ${error_h} | awk '{print (($1-$2) > ($3+$4))}') )); then
         
     best="s"
     echo -e "\tstand"
     color=${BROWN}
     best_string="stand"
     
    elif (( $(echo ${ev_d} ${error_d} ${ev_s} ${error_s} | awk '{print (($1-$2) > ($3+$4))}') )) &&
         (( $(echo ${ev_d} ${error_d} ${ev_h} ${error_h} | awk '{print (($1-$2) > ($3+$4))}') )); then
         
     best="d"
     echo -e "\tdouble"
     color=${CYAN}
     best_string="double"
     
    elif (( $(echo ${ev_h} ${error_h} ${ev_s} ${error_s} | awk '{print (($1-$2) > ($3+$4))}') )) &&
         (( $(echo ${ev_h} ${error_h} ${ev_d} ${error_d} | awk '{print (($1-$2) > ($3+$4))}') )); then
         
     best="h"
     echo -e "\thit"
     color=${MAGENTA}
     best_string="hit"
         
    else
    
     best="x"
     n=$((${n} * 10))
     echo -e "\tuncertain"
     color=${NC}
     best_string="uncertain"
     
     n=$((${n} * 4))
     
    fi
    
    echo -e ${color}"\t"${best_string}${NC}
    
   done

   strategy[${t}${hand},${upcard}]=${best}
   
   
   
 #    echo "| ${t}${hand}-${upcard} | ${n} | ${ev_s} (${error_s}) | ${ev_h} (${error_h}) | ${ev_d} (${error_d}) |" >> table.md
 #    
 #    echo " <!-- ${upcard} -->" >> ${type}.html
 #    echo " <td>" >> ${type}.html
 #    echo ${ev_s} ${error_s} | awk -f cell.awk >> ${type}.html
 #    echo ${ev_h} ${error_h} | awk -f cell.awk >> ${type}.html
 #    echo ${ev_d} ${error_d} | awk -f cell.awk >> ${type}.html
 #    echo " </td>" >> ${type}.html
   echo "| ${t}${hand}-${upcard} | $(printf %.1e ${n}) | ${ev_s} (${error_s}) | ${ev_d} (${error_d}) | ${ev_h} (${error_h}) | ${best_string} | " >> table.md
    
   echo " <!-- ${upcard} -->" >> ${type}.html
   echo " <td>" >> ${type}.html
   echo ${ev_s} ${error_s} | awk -f html_cell.awk >> ${type}.html
   echo ${ev_h} ${error_h} | awk -f html_cell.awk >> ${type}.html
   echo ${ev_d} ${error_d} | awk -f html_cell.awk >> ${type}.html
   echo " </td>" >> ${type}.html
   
   
   # save the strategy again with the best strategy
@@ -213,7 +239,7 @@ EOF
   done
  done
  
  echo "</tr>" >> ${type}.html
 #   echo "</tr>" >> ${type}.html
  
 done
 done
@@ -222,8 +248,8 @@ done
 cat << EOF >> table.md


 | Hand | \$n\$ |  Yes  |  No  |
 | ---- | ----- | ----- | ---- |
 |  Hand  |  \$n\$  |   Yes [%]  |   No [%]   |
 |:------:|:-------:|:----------:|:----------:|
 EOF

 # --------------------------------------------------------------------
@@ -259,7 +285,7 @@ for hand in A T $(seq 9 -1 2); do
    upcard_n=$(($upcard))
  fi
 
  n=10000    # start with n hands
  n=${n0}   # start with n0 hands
  best="x"  # x means we don't know what to do yet, so play
   
  while [ "${best}" = "x" ]; do
@@ -328,8 +354,8 @@ EOF
   
   if [ $n -le ${n_max} ]; then 
    # if we still have room, take into account errors
    error_y=$(echo ${error[${t}${hand},${upcard},y]} | awk '{printf("%+.1f", 100*$1)}')
    error_n=$(echo ${error[${t}${hand},${upcard},n]} | awk '{printf("%+.1f", 100*$1)}')
    error_y=$(echo ${error[${t}${hand},${upcard},y]} | awk '{printf("%.1f", 100*$1)}')
    error_n=$(echo ${error[${t}${hand},${upcard},n]} | awk '{printf("%.1f", 100*$1)}')
   else
    # instead of running infinite hands, above a threshold assume errors are zero
    error_y=0
@@ -340,25 +366,37 @@ EOF
   echo -ne "\t${ev_n}\t(${error_n})"
   
   if   (( $(echo ${ev_y} ${error_y} ${ev_n} ${error_n} | awk '{print (($1-$2) > ($3+$4))}') )); then
   
    best="y"
    echo -e "\tyes"
    color=${GREEN}
    best_string="yes"
    
   elif (( $(echo ${ev_n} ${error_n} ${ev_y} ${error_y} | awk '{print (($1-$2) > ($3+$4))}') )); then
   
    best="n"
    echo -e "\tno"
    color=${RED}
    best_string="no"
   
   else
   
    best="x"
    n=$((${n} * 10))
    echo -e "\tuncertain"
    color=${NC}
    best_string="uncertain"
    
    n=$((${n} * 4))
    
   fi
   
   echo -e ${color}"\t"${best_string}${NC}
  done

  echo "| ${t}${hand}-${upcard} | ${n} | ${ev_y} (${error_y}) | ${ev_n} (${error_n}) |" >> table.md
  echo "| ${t}${hand}-${upcard} | $(printf %.1e ${n}) | ${ev_y} (${error_y}) | ${ev_n} (${error_n}) | ${best_string} | " >> table.md
  
 #   echo " <!-- ${upcard} -->" >> ${type}.html
 #   echo " <td>" >> ${type}.html
 #   echo ${ev_y} ${error_y} | awk -f cell.awk >> ${type}.html
 #   echo ${ev_n} ${error_n} | awk -f cell.awk >> ${type}.html
 #   echo " </td>" >> ${type}.html
  echo " <!-- ${upcard} -->" >> ${type}.html
  echo " <td>" >> ${type}.html
  echo ${ev_y} ${error_y} | awk -f html_cell.awk >> ${type}.html
  echo ${ev_n} ${error_n} | awk -f html_cell.awk >> ${type}.html
  echo " </td>" >> ${type}.html
  
  
  strategy[${t}${hand},${upcard}]=${best}
@@ -377,7 +415,7 @@ done

 
 cat header.txt hard.txt header.txt soft.txt header.txt pair.txt > bs.txt
 rm -f blackjack.conf
 rm -f hard.txt soft.txt pair.txt blackjack.conf
 if [ "${debug}" == "0" ]; then
 rm -f *.yaml
 rm -f *.str
--- a/players/README.md
+++ b/players/README.md
@@ -6,9 +6,9 @@ title: Example players for LibreBlackjack

 The subdirectory `players` contains some automatic players that play against LibreBlackjack. These players are coded in different languages and communicate with LibreBlackjack in a variety of ways in order to illustrate the design basis:

 * [`00-internal`](00-internal) uses the internal player that defaults to playing one million hands of basic strategy
 * [`00-internal`](00-internal) uses the internal player (coded in C++) that defaults to playing one million hands of basic strategy
 * [`02-always-stand`](02-always-stand), using the UNIX tool `yes` this player always says “stand” into the standard output (which is piped to libreblackjack’s standard input) no matter what the cards are
 * [`05-no-bust`](05-no-bust) is a PERL-based player does not bust (i.e. hits if the hard total is less than twelve) that receives tha cards through the standard input but draws or stands using a FIFO to talk back to the dealer
 * [`05-no-bust`](05-no-bust) is a Perl-based player that never busts (i.e. hits only if the hard total is less than twelve). It receives the cards through the standard input but draws or stands using a FIFO to talk back to the dealer. There are also implementations in AWK (similar speed) and in a pure Bash script (10x slower)
 * [`08-mimic-the-dealer`](08-mimic-the-dealer) does the same as the dealer does (hits soft seventeens). It is implemented in AWK using two FIFOs.
 * [`20-basic-strategy`](20-basic-strategy) derives the basic strategy from scratch in less than one minute.
 * [`20-basic-strategy`](20-basic-strategy) derives the basic strategy from scratch in less than one minute by calling the internal player successively with different strategy files from a shell script.

--- a/src/conf.cpp
+++ b/src/conf.cpp
@@ -218,8 +218,16 @@ bool Configuration::set(int *value, std::list<std::string> key) {
 }

 bool Configuration::set(unsigned int *value, std::list<std::string> key) {
  // check for negative values
  for (auto it : key) {
    if (exists(*(&it))) {
      
      int tmp = std::stoi(data[*(&it)]);
      if (tmp < 0) {
        std::cerr << "key " << *(&it) << " cannot be negative" << std::endl;
        exit(-1);
      }
      
      *value = std::stoi(data[*(&it)]);
      return true;
    }