pondelok 12. septembra 2016

Developing Widelands's (strategic game) AI with genetic / evolutionary algorithm - Part 2

Developing Widelands's (strategic game) AI with genetic / evolutionary algorithm - Part 2


This part II. will cover following topics:

* New inheritance logic
* New type of neurons
* Automated testing
* Some statistics

New inheritance logic

After some thinking current generation of new DNA of AI players look like this:

There are 4 distinct DNA sets in the code, and new AI player takes genes randomly from all of them and performs mutation with probability about 1/15 afterwards.

Moreover each of 4 input DNAs are linked to test map, I picked the maps so that they are different and covers various conditions (limited space, snow, green land). These four maps are used for testing and DNA of best AI player from particular map is copied to the source code and used for generation of new players in next iteration of test. For more info see Automated Testing section

Consideration:
Is random inheritance from 4 parents evenly the best approach?

New types of neurons:

I felt a need for more complicated neurons, that have two inputs and output is based on its combination, so I came up with "bi_neurons", that receives two bools and with probability of 1/4 returns "weight", otherwise 0. Idea is f.e. scenario:
- if my strength is growing 
- and enemy's strength is decreasing
-> return some value
(otherwise return 0)

Keep it mind, the AI is to be blackbox and I am not to say which of possible combinations:

  • me up, enemy down
  • me up, enemy up
  • me down, enemy up
  • me down, enemy down

should return a value and what value it should be. This is the purpose of "evolution" to figure it out. The value is than used as a part of score when deciding about attacking the enemy.

Automated testing

After some struggling with widelands ability to launch "unattended" games, I come up with this setup:


  • I have 4 scripts, and each one runs games in loop - using the same map (4 maps are used for testing), writes logs to files (file names are unique, of  course)
  • I was not able to redirect it to Xvfb, so the games pops up on screen, so I decided to run test while I am not on PC (at night f.e.)
  • Lua scripts are terminating a every game after some gametime (2 - 4 hours, depending on map size), that in real time means 10 - 30 minutes of game
  • duration of "test session" is about 4 hours, so I am getting 10 - 30 games per map during "session"
  • All 4 scripts are running concurently (that means 4 games concurently)
  • I have python script that goes over acquired logs and pick player with higher score and prints its DNA, so I can easily copy it to cc file and recompile


Some statistics

As I probably mentioned, I am using score to evaluate "fitness" of AI player and it is also used to find out if best AI player is better that best AI player from previous session. Usually it is so and with each iteration best score goes up by 10% (evaluated per map).
I have to say that spread of scores per map in a session is very big, f.e. on map Elven Forests I right now have these results (from 27 games from last test session):

Winner score:   1814
Average score:  910
Lowest score:   223

For this stage of development the variance is no issue, but for final version AI will have to provide more consistent performance.


Part I.

utorok 23. augusta 2016

Developing Widelands's AI with genetic / evolutionary algorithm

Developing Widelands's (strategic game) AI with genetic / evolutionary algorithm

Disclaimer: This is experimental work and might never make it into release.

Content of this blogspot:

* Introduction of widelands
* Current status of AI in (pre-)B19
* Idea
* Neuron
* AI training

Introduction of Widelands

Widelands is Settlers II clone, open source strategy game, available for all platforms. Please visit widelands homepage for more info.


Current status of AI in (pre-)B19

I significantly reworked AI in 2015 & 2016, this is between releases 18 and 19. Size of code (as located in src/ai/) grew significantly, 2-fold or more. In addition to new functionality and logic, the code contains a lot of numbers, see at this chunk of code (no need to study it deeply):

// score here is a compound of various input values
// usually resources in vicinity, but when enemy is nearby
// additional bonus is added
if (bf->enemy_nearby) {
prio += bf->military_loneliness / 3;
prio += (20 - bf->area_military_capacity) * 10;
prio -= bo.build_material_shortage * 50;
prio -= (bf->military_in_constr_nearby + bf->military_unstationed) * 50;
} else {
if (bf->near_border) {
prio += 50;
prio -= bo.build_material_shortage * 150;
} else {
prio -= bo.build_material_shortage * 500;  // prohibitive
}
prio -= (bf->military_in_constr_nearby + bf->military_unstationed) * 150;
prio += (5 - bf->own_military_sites_nearby_()) * 15;
}
prio += bf->unconnected_nearby * 50;
prio += bf->unowned_land_nearby * resource_necessity_territory_ / 100;
prio += bf->unowned_mines_spots_nearby * resource_necessity_mines_ / 100;
prio +=
  ((bf->unowned_mines_spots_nearby > 0) ? 35 : 0) * resource_necessity_mines_ / 100;
prio += bf->rocks_nearby / 2;
prio += bf->water_nearby;
prio += bf->distant_water * resource_necessity_water_needed_ / 100;
prio += bf->military_loneliness / 10;
prio += bf->trees_nearby / 3;
if (bf->portspace_nearby == ExtendedBool::kTrue) {
if (num_ports == 0) {
prio += 100;
} else {
prio += 25;
}
}


First a broad view: this is part of construct_building() function, which in addition to many other things, goes over empty spaces ("buildable fields") and buildable buildings and scores all combinations. The "prio" is a score for building x field. This chunk of code is for military sites.

As you can notice a lot of variables and a lot of numbers (perhaps we can use word WEIGHT) are employed here.

Idea

Such development and tweaking is very time consuming. You need to compile - run - watch (game and logs) - re-think - edit - compile and again and again. As now there is a lot of discussion about AI (mainly neural networks) everywhere, so I decided that it would be very convenient if I could use some AI algorithm to simplify the development.

Neural networks - no go for now

First where I looked at was neural networks. I have no personal experience with this, but based on what I read on internet, it would be very complicated. At least memory requirements would be huge. Consider map of 512x512, each field having multiple attributes, about 50 types of buildings, roads between them, dependencies (material flows) between them, warfare aspects. Looks like a huge task. Moreover it seems to me that no implementation of neural network for a game like this exists yet.

Lightweight approach

In fact I don't think it is necessary to dump and scrap current AI, but we can use lightweight approach to use some AI technique to help with only small portion of AI. This is what I decided for.

If you look at above code snippet, you there is a lot of numbers there. And idea is to use randomness and natural selection to come up with best AI. So in other words, the evolutionary algorithm is not to be used to play the game, but to help pick best parameters for the game's AI.

Neurons

This is inspiration from neural networks.

Two things can be mentioned in relation to NN:

Activation curves

Well, if you look at example above, most variables have linear impact on the score. But when I was reading about neural networks I noticed strong emphasis on non-linearity of activation curve. And this is something that looks very interesting and useful here. But I also realized, that computation of such curves is CPU demanding and CPU use is already a problem here. The solution is mentioned below.

Neurons

Well, my AI uses word NEURON but it misuses it a bit. Neuron in my implementation is just an object the receives inputs in range (0 - 20) and returns outputs in range (-100 - 100). As number of possible outputs is limited to 21, the neuron contains array with all outputs and no calculation is done in realtime.

Following predefined curves are available for neurons:

{   0,  5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100}
{   0,  0, 1,  2,  4, 6, 9, 12, 16, 20, 25, 30, 36, 42, 49, 56, 64, 72, 81, 90, 100}
{   0, 17, 25, 32, 38, 44, 49, 53, 58, 62, 66, 70, 74, 78, 81, 84, 88, 91, 94,7, 100}
{ 100, 97,  94, 91, 88, 84, 81, 78, 74, 70, 66, 62, 58, 53, 49, 44, 38, 32, 15, 17,   0}
{ 100, 95, 90, 85, 80, 75, 70, 65, 60, 55, 50, 45, 40, 35, 30, 25, 20, 15, 10,  5, 0}
{-100,-99, -98,-96,-94,-92,-88,-83,-73,-55,  0, 55, 73, 83, 88, 92,94, 96, 98, 99, 100}

The curve is modified by weight parameter (-100 - 100), that is multiplicator of above curve (and divided by 100), so if we will use weight -20, values from first curve will range from 0 to -20.

What is a bit of work for developer is to trim a variable to get (mostly) in range 0 - 20, if the value is out of range, outputs corresponding to 0 or 20 is returned by neuron. So I have to modify inputs so that range is used as much as possible, but got rarely out of it.

So entire AI can be "dumped" (and used for initialization) as:

const std::vector<int8_t> input_weights =
      { 86,  70, -52,  30, -25,  30,  30,  -5,  15,  80,
        30, -98,  30, -60,  93, -68, -34, -29,  30,  73,
        30, -12,  30, -93,  30,  30,  30,  80,   6,  30,
       -30,   9, -18,   3,  30, -58,  30,  30,  30}

const std::vector<int8_t> input_func =
      {  4,   1,   5,   1,   0,   1,   1,   0,   1,   3,
         1,   1,   1,   4,   1,   2,   3,   5,   1,   2,
         1,   2,   1,   3,   1,   1,   1,   3,   1,   1,
         4,   2,   2,   0,   1,   1,   1,   1,   1}


So it is a bunch of numbers. Every neuron is initialized with a pair taken from both vectors. And the code where these values are used can look like:

field.military_score_ += management_data.neuron_pool[0].get_result_safe(field.military_loneliness / 50);
field.military_score_ += management_data.neuron_pool[4].get_result_safe((field.area_military_capacity + 4) / 5);
field.military_score_ +=
management_data.neuron_pool[8].get_result_safe((field.military_in_constr_nearby + field.military_unstationed));
field.military_score_ +=
management_data.neuron_pool[12].get_result_safe(field.own_military_sites_nearby_());
field.military_score_ +=
management_data.neuron_pool[13].get_result_safe(field.military_stationed);
field.military_score_ +=
management_data.neuron_pool[1].get_result_safe(static_cast<uint8_t>(soldier_status_) * 3);

For sake of simplicity, I will use word DNA when talking about function type and weights used to initialize neurons.

AI training

Well very basic setup for this kind of stuff is to start a game (AI players only), with very high number of AI players with slightly changed "DNA". Afterwards to pick one or two best performers and use their DNA to breed new bunch of AI players for next game.

But here it is not so simple. Number of players is limited to 8, game even in high speed can take about 1 hour. Than copy/pasting of DNA of winner (being dumped into terminal) into source, then compilation is needed and new game must be started and run.

So when AI is initialized, it takes given arrays with DNA information and randomly changes about 10% of neurons. This is done for every computer player separately, of course.

Measuring of performance of AI

Well, the game provides few types of graphs (territory curves, military strength curves), so I can have some rough indication which AI is working best, or at least a better as in previous iteration. But for purposes described lower, I generates simple formula how to measure performance of AI:

score = 3 * military_sites + buildable_fields + 10 * production_sites + 10 * mines + 3 * military_strength;

score = military_sites + buildable_fields + 6 * production_sites + 10 * mines + 2 * military_strength + 4 * lost_soldiers + 8 * warehouses + 5 * my_strength_minus_average_enemies_strength;

In-game DNA mutation

Well, due to low number of players I was looking for ways how to enable modification of DNA of AI players during game. Above I presented a score, this is what I use to watch how AI is coping. Currently the logic is:

if score / score_40_minutes_ago < 1.1:
modify_DNA()

This allow the game itself (without my intervention) to change DNA of weak performers. But until one critical point - when AI starts fight together. Here it is much more difficult to come up with score that would indicate "good enough performance".

Final words

Well this is open-end endeavor - I have no idea if this ever be good enough for release, but it is fun to work on :)

Part II