Tuesday, May 24, 2011
Saturday, May 21, 2011
So let us look at performance one last time on what seems to be the final version of the isometric rendering engine before "Stoneage". I have some better ideas than the current implementation, but they are a radical departure and I don't want to rewrite it.
I will be using a nonrandom uniform and symmetric map. This may seem like cheating to some, but it is not, since if I get a let's say 10% improvement, that improvement will scale with non random maps too. A static and uniform map will allow me to measure performance gains better.
I will be using mostly pure software rendering, at 1280x720 resolution, with the game compiled in debug mode. Using hardware acceleration delegates a lot of the work to the GPU, and thus it would be harder to see if my changes have an impact on the performance since I can only influence the CPU. But I will measure performance for other setups as well once in a while. Results will be specific to my machine, and I will not be using the fastest machine available to me.
While my real problems are related to scroll speed, I will first measure the rendering performance. So let us render just a single layer from our map. Since the map is uniform and symmetrical, it makes little difference what region I am rendering, as long as I keep away from the borders. Half height wall rendering is off, since rendering in such a way is faster. I get a steady 65 FPS. Now two levels: 66. Hmmm... I guess that CPU caching is to blame for the increase, since the game has been running for a while and kept traversing the exact same data structures. Three levels: the same. Ten levels: same. Twenty: still 65, but it keeps spiking up higher, sometimes reaching 70. All Z levels: the same. Good. Everything is working as expected! As I detailed in a post a long time ago, the engine does not care about map horizontal sizes, and it only cares a little about the number of Z levels.
Repeating the experiment with Irrlicht's more advanced software renderer, we get a steady 50 FPS. On most machines the advanced one will be faster or at least as fast as normal rendering, but my machine is weird.
DirectX 8: 65. Again, my machine is weird.
DirectX 9: 183! OK, now we are talking!
OpenGL: 36. I am one of those people for whom OpenGL has never worked as well as DirectX. A curse? A blessing? I don't know. This is why I tend to dismiss OpenGL development with a smug smirk on my face, even though I have some plans with OpenGL in the future.
Before we go to the real problem, scrolling speed, let me check if the rendering loop is really as optimized as it should be.
*checks out code* *facepalms* *current system is old and outdated* *fixes it* *does not work* *finds ancient bug in image loading* *fixes it*
Done! I have unified the floor and tile drawing system. There is no noticeable performance gain, but the new system is simpler so it was a worthwhile change. Rendering will never get any faster (as long as the number of tiles remains constant of course). The renderer is basically a single one dimensional for loop containing a single branch. There is basically no way to simplify this. Maybe the branch could be removed, but I doubt that would have an impact on performance since the number of tiles is not that great. And while the number of tiles increases greatly once you increase zoom, in this mode you get between 30 and 40 FPS with software rendering.
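To make the "single loop, single branch" shape concrete, here is a minimal sketch. It is not the engine's code: CachedCell, DrawLog and the convention that a negative tile index means "empty" are all assumptions made for illustration, and the real renderer would blit through Irrlicht instead of recording indexes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cache entry: one precomputed, ready-to-draw cell.
struct CachedCell {
    int screenX;
    int screenY;
    int tileIndex; // negative means "nothing to draw here" (assumed convention)
};

// Stand-in for the real blit call; it only records what would be drawn.
struct DrawLog {
    std::vector<int> tiles;
    void drawTile(int tileIndex, int /*x*/, int /*y*/) { tiles.push_back(tileIndex); }
};

// One flat loop over the cache with a single branch per cell.
std::size_t renderCache(const std::vector<CachedCell>& cache, DrawLog& out) {
    std::size_t drawn = 0;
    for (std::size_t i = 0; i < cache.size(); ++i) {
        const CachedCell& c = cache[i];
        if (c.tileIndex < 0)
            continue; // the single branch: skip empty cells
        out.drawTile(c.tileIndex, c.screenX, c.screenY);
        ++drawn;
    }
    return drawn;
}
```

All the expensive decisions (what is visible, where it lands on screen) are assumed to have been made when the cache was built, which is why the loop itself has nothing left to simplify.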
So now it is time to tackle the real culprit: scroll speed. Why is scroll speed low? Because we cache the map so we can have the renderer be as fast as it is right now. Measuring the scroll speed under default circumstances, we get the values 0, 15 and 16 ms, with zero being the most common. As I said, this is a limitation of the timer that I am using: it has a low resolution. But anyway, 16 ms is a good value. Now let us increase the floor levels to maximum: we get 16 ms almost every time. So in the first case we actually had something around 2-4 ms, and it only reported a non zero value once these small numbers started accumulating. Now we are at around 14-15 ms based on the frequency of the 16 values. Still not bad: scrolling is slightly less smooth, but it is barely noticeable. While keeping a scroll key pressed, FPS drops by about 10, again fully acceptable. But now, let us increase the zoom to maximum: 187-210 ms. Ouch! With such high values, you cannot keep the button pressed and FPS drops to around 3 when continuously scrolling.
As a first step I try to optimize the bounds detection algorithms when building the cache. This is the part that tells me how much of the map fits on the screen. In a top down map, a rectangular area from the map is rendered as a rectangular area on screen. But in isometric mode, a rectangular area will result in a diamond shaped render on screen, so you need to take a diamond shaped area of the map which will render as a roughly rectangular area on screen. Optimizing this, I have gotten to 156-171 ms, so about 30 ms less. A good start.
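The reasoning about the coarse timer can be written down as a small calculation: if the timer only ever reports 0 or roughly 16 ms, the average over many readings approximates the real cost of one call. The function below is a sketch of that idea; the name estimateMeanMs is made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Estimate the true per-call duration from the readings of a coarse timer.
// Each reading is what the timer reported for one call (for example 0 or 16 ms).
double estimateMeanMs(const std::vector<int>& readingsMs) {
    assert(!readingsMs.empty());
    long long total = 0;
    for (int r : readingsMs)
        total += r;
    return static_cast<double>(total) / readingsMs.size();
}
```

With invented readings where one call in four reports 16 ms and the rest report 0, the estimate comes out at 4 ms per call, which is the kind of inference made above from how often the 16 shows up.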
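Here is a rough sketch of why the scanned map area is a diamond. It assumes the common 2:1 diamond projection, which may not match what the engine actually uses, and Cell, visibleCells, halfW and halfH are illustrative names. The trick is to loop over x - y and x + y, which map directly to the horizontal and vertical screen axes.

```cpp
#include <cassert>
#include <vector>

struct Cell { int x; int y; };

// Assumed projection (not taken from the engine):
//   screenX = (x - y) * halfW,  screenY = (x + y) * halfH
// Returns the map cells whose anchor point falls inside the screen rectangle
// [0, screenW] x [0, screenH]. In map space these cells form a diamond.
// Cells with negative coordinates would be clipped against the map in real code.
std::vector<Cell> visibleCells(int screenW, int screenH, int halfW, int halfH) {
    assert(screenW >= 0 && screenH >= 0 && halfW > 0 && halfH > 0);
    std::vector<Cell> cells;
    const int diffMax = screenW / halfW; // largest x - y still on screen
    const int sumMax = screenH / halfH;  // largest x + y still on screen
    for (int sum = 0; sum <= sumMax; ++sum) {
        for (int diff = 0; diff <= diffMax; ++diff) {
            if ((sum + diff) % 2 != 0)
                continue; // x and y would not be whole numbers
            Cell c = { (sum + diff) / 2, (sum - diff) / 2 };
            cells.push_back(c);
        }
    }
    return cells;
}
```

Tightening these bounds matters because every cell that is scanned but ends up off screen is wasted work, and the number of cells grows quickly at high zoom.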
Then I greatly violated the DRY principle, but only in the deepest depths of my deep code, thus eliminating an extra if per cell.
Now that the code that determines the area that should be scanned is more efficient, the code that actually does the scanning for visibility should be optimized. Unfortunately, this is as good as it will get and there is no way to optimize it. So I also cached the results of the scan, stuffing the data in some free space in the available cache data structure, avoiding an increase in RAM consumption. This does have the disadvantage that the cache must be updated on wall dig/build operations, but this can be done locally and atomically, so it is not a big deal. New results: 109-125 ms.
There is one single worthwhile experiment: adding a new field to the cache, so the new data does not get inserted into left over space and thus the CPU can access it faster, but increasing the RAM consumption: 94-110 ms. I am not sure about this change. The RAM increase is negligible, but the speed increase is not great. One more test with a huge RAM increase: the same, so not worth it. I'll go with solution "a", small RAM increase, small gain.
Well, this is about all I can do. I do not see any way to greatly increase scrolling performance further, but some minor gains could yet be achieved. But going from 187-210 ms to 94-110 ms was definitely worth over half a day of optimizing. The results are great, but not stellar. The best part is that with the game compiled in optimal mode (not in debug) and using the default world generation and display values, scrolling at the highest zoom level is pretty snappy. You do get a small snag once in a while, but you can keep the button pressed. Even tripling the number of Z levels leads to less snappy but good results.
I know I am repeating myself, but these changes only count on maximum zoom level. On normal zoom, there was never a problem, but today's optimizations have certainly sped up even the normal zoom level scrolling and map updating, thus making the game more playable on even older hardware (theoretically speaking of course. I need to test it out on some netbook before I can claim this with certainty). And of course, if there are still performance issues left, you can always reduce the resolution. I have just tried out 800x600 on highest zoom level with all Z levels visible and it is great, with almost no lag at all and 52 FPS in optimal mode. I switched over to debug mode so I can measure the cache building speed, and it is between 46 and 62 ms. So cache building performance is directly proportional to the resolution.
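Storing the scan result in left over space can be pictured as bit packing. The layout below is purely hypothetical (a 32 bit entry with a 24 bit tile index and one spare bit used as the visibility flag); the real cache structure is not shown in this post.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout of one 32-bit cache entry: the low 24 bits hold a tile
// index and the top bit, previously unused, holds the cached scan result.
const std::uint32_t TILE_MASK   = 0x00FFFFFFu;
const std::uint32_t VISIBLE_BIT = 0x80000000u;

// Returns the entry with the visibility flag set or cleared.
inline std::uint32_t setVisible(std::uint32_t entry, bool visible) {
    return visible ? (entry | VISIBLE_BIT) : (entry & ~VISIBLE_BIT);
}

inline bool isVisible(std::uint32_t entry) {
    return (entry & VISIBLE_BIT) != 0;
}

inline std::uint32_t tileIndex(std::uint32_t entry) {
    return entry & TILE_MASK;
}
```

A dig or build operation would then only need to call something like setVisible on the affected cell and its neighbours, which is what makes the update local. The masking on every read is also a plausible reason why a dedicated field is a bit faster.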
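The proportionality claim can be sanity checked with the numbers from this post by scaling the 1280x720 debug measurement by the ratio of pixel counts. The helper name scaleByPixels is made up for this check.

```cpp
#include <cassert>

// Scale a measured time by the ratio of pixel counts between two resolutions.
double scaleByPixels(double measuredMs, int fromW, int fromH, int toW, int toH) {
    assert(fromW > 0 && fromH > 0 && toW > 0 && toH > 0);
    const double fromPixels = static_cast<double>(fromW) * fromH;
    const double toPixels = static_cast<double>(toW) * toH;
    return measuredMs * toPixels / fromPixels;
}
```

800x600 has 480000 pixels against 921600 for 1280x720, a ratio of about 0.52. Scaling the 94-110 ms range by that gives roughly 49-57 ms, which sits inside the measured 46-62 ms once the timer's coarse resolution is taken into account.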
Thus, version 3.0 of my isometric rendering engine is done. I will not optimize it again until after "Stoneage" (I will fix bugs though) and if you see me tinkering yet again with it please give me a slap to snap me out of it.
Friday, May 20, 2011
I am cleaning up the editor and getting rid of unwanted data. Here I will be recording the data that will go in to the first versions of the game, the data planned and the placeholder data that I am removing. "Release" date is approaching fast, so I need to clean things before you can get your hands on the editor (or not).
First biome will be "Generic forest". Come to think of it, it is not really a forest. It should be called something better, but I'll give the final names once the list is more stable, probably names based on real life climates. It will contain a fair balance of bushes and trees, with a fairly uniform distribution. Tree list: birch, oak, pine, willow. The leaves and fruit of these trees lack any special uses, just the general uses. They are not a viable food source. Bush list is not final, but will contain 3-5 plants, 1-2 for alcohol, 1 smokable, 1 usable in clothing industry. And of course these plants will be the primary food source. All plants will be surface plants. As I said, there will be no turtling up deep in a cave in the initial phases. I want you out in the open, working the fields and defending them, while balancing that with an ever increasing underground fortress. Once water is in the game, willow will prefer wet areas.
"Generic forest" will be the first supported and maintained biome. The second will be "Fruit forest", with an abundance of fruit bearing trees and plants having a more industrial role. Third will be "Dense forest", with a lot of coniferous trees, a high area that is an industrial heaven, but has fewer sources of food.
Then I will have two shrublands, one dry, with only a tree here and there, and the second a swamp. And one desert area, with cacti. I have no idea how to make you survive here, and the realism of food generation will probably take a serious hit here.
I am focusing on "Generic forest", which will be little baby's first biome, a fairly generic, balanced and bland one. The rest of the biomes are a little sketchy, and if time does not permit it, "Stoneage" will only feature this biome and if I can help it, the desert one (because of the challenge).
So: Generic forest, Fruit forest, Dense forest, Shrublands, Swamp, Desert. The list is short but I don't want to launch a bazillion different biomes with uninteresting differences between them. I want each to have its own feel and own unique list of plants and animals, and I'll keep adding to the list whenever I get a new idea. I am already designing a fantasy biome.
What is removed from the biome list? All DF biomes, including: Mountain (I'll have mountains, but they won't be a biome; adding specificity here is done by height rules), Glacier, Tundra, Temperate Freshwater Swamp, Temperate Saltwater Swamp, Temperate Freshwater Marsh, Temperate Saltwater Marsh, Tropical Freshwater Swamp, Tropical Saltwater Swamp, Mangrove Swamp, Tropical Freshwater Marsh, Tropical Saltwater Marsh, Taiga, Temperate Coniferous Forest, Temperate Broadleaf Forest, Tropical Coniferous Forest, Tropical Dry Broadleaf Forest, Tropical Moist Broadleaf Forest, Temperate Grassland, Temperate Savanna, Temperate Shrubland, Tropical Grassland, Tropical Savanna, Tropical Shrubland, Badlands, Rocky Wasteland, Sand Desert. River and ocean biomes were never included.
I mentioned birch, oak, pine, willow as the trees included in the first biome. Here are a few more trees: acacia, alder, ash, birch, cedar, chestnut, cacao tree, kapok, mango tree, larch, mahogany, maple, mangrove and palm. The list is still missing a few classical fruit trees (apple, pear, orange...) and a few made-up trees that I will be using to balance a biome. Saguaro is the only cactus I have right now, but I'll add 3 more.
I removed all DF specific trees: blood thorn, candlenut, feather tree, fungiwood, glumprong, highwood, black-cap, goblin-cap, nether-cap, rubber tree, spore tree, tower-cap and tunnel tube.
Plant list is in disarray now, I will talk about plants in another post. But the plan is to have a few bush like plants, but also 1-2 grain bearing high density plants that you will plant in large farm plots. These plants will appear in small patches here and there, but the sharp contrast between the landscape in its natural form and after dwarf hand intervention that changed it to support your fortress' population is intentional. There will be no mushrooms in the first version and nothing can be planted underground or under poor conditions.
The animal list is even worse, but I am not doing animals for "Stoneage". But the domestic animal list will be the same as in DF (those are actually domestic animals), but without mules (do we need to have the difference between a donkey and a mule? I think not) and camels. What I want is that each biome shall have a specific semi-wild but half-trained animal that you can choose pre-embark if you wish. This animal will be very expensive but provide a biome specific unique advantage. You won't be able to get two without severely gimping your initial supplies, so you will need to capture a mate for it if you want more. Why not just capture a specimen in the early stage without buying one? Because the bought specimen will give a significant boost in domesticating the species.
That is enough about content for today. As you probably noticed from the images, the editor has gone through some changes, but I will talk about those in a next post.
Wednesday, May 18, 2011
Why must you do this to me, WhenMouseLeave / WhenMouseEnter / IsUnderMouse? These methods are extremely hard to get perfect. And even when you think you are done, you get a popup window out of nowhere that does not notify you of a mouse move event or the OS does something strange with the mouse and you are left dangling. Anyway, I solved most of these.
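For readers who have not fought this particular battle, here is a minimal sketch of one way to keep enter and leave events paired. Only the names WhenMouseEnter, WhenMouseLeave and the "is under mouse" idea come from this post. Rect, Widget and HoverTracker are hypothetical, and a real GUI would also have to deal with nesting and z-order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Rect {
    int x, y, w, h;
    bool contains(int px, int py) const {
        return px >= x && px < x + w && py >= y && py < y + h;
    }
};

// Hypothetical widget; the counters exist only to make the behaviour visible.
struct Widget {
    Rect bounds;
    bool isUnderMouse;
    int enterCount;
    int leaveCount;
    explicit Widget(Rect r) : bounds(r), isUnderMouse(false), enterCount(0), leaveCount(0) {}
    void WhenMouseEnter() { isUnderMouse = true; ++enterCount; }
    void WhenMouseLeave() { isUnderMouse = false; ++leaveCount; }
};

// Remembers the hovered widget so every enter is paired with exactly one leave.
class HoverTracker {
public:
    HoverTracker() : hovered(nullptr) {}

    // Call on every mouse move. Later widgets in the list are treated as on top.
    void mouseMoved(const std::vector<Widget*>& widgets, int mx, int my) {
        Widget* hit = nullptr;
        for (std::size_t i = 0; i < widgets.size(); ++i)
            if (widgets[i]->bounds.contains(mx, my))
                hit = widgets[i];
        setHovered(hit);
    }

    // Call when the cursor leaves the window or a popup steals the mouse.
    void mouseLost() { setHovered(nullptr); }

private:
    void setHovered(Widget* next) {
        if (next == hovered)
            return; // nothing changed, so no events fire
        if (hovered)
            hovered->WhenMouseLeave();
        if (next)
            next->WhenMouseEnter();
        hovered = next;
    }

    Widget* hovered;
};
```

The mouseLost path is the part that tends to be forgotten: when no move event arrives at all, something else has to tell the tracker that the hover ended.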
I also solved a huge yet elusive bug that only happened once the widget nesting reached a depth of over four. It took me hours to find and solve this bug. But everything is starting to come together. The new GUI system is not only a lot more stable and mature when compared to yesterday, but it is also more feature rich. So I removed the old action bar in favor of the new one (but I only commented the code out).
I also made animations a little more FPS agnostic. It is not perfect yet, because the resolution of the timer I am using is not high enough, but it is a big step in the right direction.
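"FPS agnostic" here means that animation progress follows elapsed time instead of the number of frames drawn. A minimal sketch, with Animation and its members being illustrative names and not the game's classes:

```cpp
#include <cassert>

// Hypothetical animation whose progress depends on elapsed time, not frames.
struct Animation {
    double durationMs;
    double elapsedMs;

    explicit Animation(double duration) : durationMs(duration), elapsedMs(0.0) {
        assert(duration > 0.0);
    }

    // Advance by the time that passed since the previous frame.
    void update(double deltaMs) {
        elapsedMs += deltaMs;
        if (elapsedMs > durationMs)
            elapsedMs = durationMs; // clamp at the end of the animation
    }

    // Progress in the range [0, 1].
    double progress() const { return elapsedMs / durationMs; }
};
```

Two frames of 50 ms and ten frames of 10 ms both leave a 200 ms animation at the halfway point. The timer problem mentioned above still applies: if the measured delta is only ever 0 or about 16 ms, the steps stay uneven even though the total duration is right.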
Since the bug fixing took me such an extended period of time, in today's post I will only talk about two topics.
Pause screen effect
I added a full screen effect when you pause the game, to differentiate between the two states. The screen becomes darker. But then I realized that this is an idiotic change since the game remains interactive in pause mode. To add insult to injury, I mark selected cells by making them darker. So I restricted this effect to when you have full pause, like when a dialog pops up. I will signal normal pause some other way, probably by a blinking play button. Here is a sample:
You can see here the new action bar too. The three letter labels will be replaced with suggestive icons. No, not that kind of suggestions!
The question is how does one make such an effect. There are multiple ways, so I chose the fastest and easiest method, which literally took me three lines of code. It is a known fact that drawing something with transparency is slower than without. It is less known that colorizing can also share these properties. Depending on the hardware/rendering mode combination, I saw anywhere from better to 3-4 times worse performance when doing colorization. It is particularly bad with software rendering. So this effect could be improved.
I am not going to add a possibly disastrous rendering mode without making it optional, am I? This leads me well to my next topic.
In game options dialog
See the small button in the lower left corner of the screen? Pressing this brings up the following dialog:
The layout is a little bit rough around the edges, but this is a full featured dialog, with current quality standards (i.e. it opens up with a small animation and buttons enable and disable properly based on the values entered in the dialog).
The first option enables or disables the above mentioned pause effect. It is enabled by default.
The second option controls Super Special Rendering. It is enabled by default. But what could this option be? Foreshadowing!!! FORRREEEESHADOWING!
The third option controls the number of Z levels that will be rendered, the fourth the rendering of half height/full height walls and the fifth the rendering of wall borders.
Then we have three numeric values that can be altered. Speed is the speed of the time compression algorithm. This will allow you to fine tune the speed if you are not happy with the default of 6. Values range from 1 to 12. Zoom controls the three zoom levels that you are already accustomed to seeing. And finally we have cycle, that is related to the in game cycle clock. It is the scheduling "FPS", or more correctly the delay. This option is more for me than for the potential player, and it might not make the final product. Values are from 0 to 30.
And finally the small button in the corner, that unceremoniously closes this option panel (read: without animation).
Tuesday, May 17, 2011
Wow, it has been total chaos lately around here. I kept jumping all around the code base, doing new things here and there and not having time to post. And I found myself in another impasse regarding graphics. And finally, there was a 2-3 day Blogger outage, so I found myself posting the last two updates on Twitter.
So let us sort things out by going over the changes! No joke here…
And I hope I can get through my entire backlog this week.
Item creation overhaul
This is the first of the Twitter announced changes. I have greatly simplified the item creation API. It is still in its own namespace and this is fairly redundant, so I’ll have to fix this, but it can be done by copy & paste. This change is the first phase of the item system overhaul. In the end the new system will occupy a lot less memory. What am I going to do with the extra memory? Make the largest possible map size bigger of course!
GUI skin mock-up
This is the second Twitter announced change. I can’t remain forever using a GUI that looks like Windows 95. The new skin is starting to take shape, but it will take some time until it looks good enough to adopt it as the new standard.
Phasing out Irrlicht GUI component
It is no secret to the readers of this blog that I am not happy with the Irrlicht GUI component. So I am phasing it out. I will still use Irrlicht, but only the drawing primitives and use my own custom and very light weight GUI system. This way, if I ever break up with Irrlicht, I can easily sway the results of the otherwise messy divorce in my direction. Hmmm, I’m not sure this metaphor works. The idea is that it will be easy to replace Irrlicht, because my entire GUI will not depend on it. The new GUI system will go hand in hand with the new skin to create the ultimate GUI experience! GUI! This is not an ideal solution, especially since doing a mouse and keyboard driven highly robust event based light weight GUI is not that easy. Nobody enjoys coding the WhenMouseEnter and WhenMouseLeave events, but I really want to get rid of Irrlicht GUI and don’t want to add a new dependency.
New action toolbar
As part of the phasing out process, I have created a new action toolbar:
You can see it under the old one. The buttons lack icons and the entire thing is not skinned so it does not look better than the old one. This is only temporary. I will be keeping both for a few days, until the new one will have passed the challenge of time and proved to be stable.
So why do I need a new bar when it behaves the same way as the old one? Because it doesn’t. Let’s see what happens when I mouse over one of these buttons:
We get a window giving an overview of the actions possible and a few extra instructions. This panel is animated and it grows out of the rest of the window with a smooth growth. It is impossible to illustrate this without an animation, especially since the effect is very short, but here is a frame from the middle of the animation:
The text is not visible while the animation is playing. This is intentional. I’ll show the whole thing animated in one of my future YouTube videos.
Let’s see the rest of the panels. I am sure there are a few spelling errors in the descriptions and I need to fix the formatting a little:
That is it for today’s post. There are lots of new things to talk about that are done, but I’ll leave that to another post.
Wednesday, May 4, 2011
And here is the video that I promised, featuring the new incarnation of the tree system. The new trees have leaves that advance naturally from season to season and that can be harvested. Additionally, tree cutting has a new phase and is now a two step process: cutting down the tree and chopping it into smaller logs.
I would love to go into depth, describing every new feature, but I am leaving again on holiday. I know that I came back from my last holiday only a week ago. Well, what can I say? The life of a game developer with dwarves is the life of a rockstar!
I'll leave you with the video which has annotations, giving you a few more details.
Monday, May 2, 2011
So I haven't posted in over two weeks and what do my eyes see? Where are all the comments? Where is the outrage? Where is the lynch mob? The end-of-days-preachers? I blame it on that rabbit. No masses declaring me dead? I really need to hire Elmer Fudd or some other high profile hunter to solve this problem.
OK, enough fooling around. It was quite hectic getting my last post out, with finishing the pre-alpha, posting on the blog, Twitter, finishing the video, solving the annotation problems, etc. so I forgot to write the outro. In this outro I wanted to say that my next post or several were going to contain something "big", so I might refrain from posting screenshots until this new big thing was done.
Then the pre-holiday period came. I was on holiday for six days, and then I just didn't have time to post or code the way I usually do it. Not to say that I didn't work. I worked, but not in an organized manner; I worked here and there, grabbing a few free windows. I actually roughly finished my big thing, and also made a lot of other changes. I wanted to post the small changes on Tuesday and the new thing on Friday, but because of the above mentioned delay, I'll post the small things today and the rest in a couple of days.
So first I started with some YAML support. Not as much support, as hacking the entire underlying XML API to enable it to load a YAML file and use the information from it to populate XML data structures. I managed to get it to load some simple but sufficiently complex YAML files (for the need of the game) and... that's about it. The next step would be to take XML data structures and save them as YAML and then transition all the data from the game automatically from XML to YAML. Why YAML? Silly programmers with their silly preferences vis-a-vis competing and sometimes interchangeable technologies. So no real reason. And I am no longer sure I'll follow this path, because YAML is deceptively complicated. But why hack the XML components to understand YAML, when I could be using a YAML library? Something like yaml-cpp. Or something else. Integrating such a library should be fairly easy. But here's the thing: hacking the XML component is fun, integrating an existing library is boring, and rewriting the entire serialization code for the game's data structures for a new library would be pure hell. Who enjoys writing serialization code? If you know such persons please do your best to avoid them! This has been a public service announcement brought to you by DwarvesH! In U++ it is very easy to write such code, both for normal serialization and marshaling (XML serialization), but still, I wouldn't like to rewrite it. With my hack, I wouldn't need to touch the serialization code. Not to insult anybody, but the situation would be different if I had a code monkey :).
Another thing I created is a new system for building tile sheets. There were some problems with the previous system. One of the big problems is that you need great flexibility, especially once I'll have a full featured modding system. In order to achieve this, I wrote a system that takes a random sized image and integrates it into the tilesheet system, calculating all indexes and coordinates so you don't have to. Here I stumbled again over the problem with hardware textures: their sizes must be powers of two. Forgetting about this, I lost over 50 FPS. It took me 10 minutes to figure out why. Not only that, but on my machines, a 512x256 texture is slower than a 512x512 one. So I ended up wasting a lot of space with filler space in textures. Then I discovered that also the number of textures is important, with less being better. Which did not fit with the increased number of textures due to the white space.
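The power of two constraint and the space it wastes are easy to quantify. The sketch below rounds each dimension up independently; the function names are made up, and the observation that 512x256 is slower than 512x512 on some machines is not modelled here.

```cpp
#include <cassert>

// Smallest power of two that is >= n (n must be positive and fit the result).
unsigned nextPowerOfTwo(unsigned n) {
    assert(n > 0 && n <= 0x80000000u);
    unsigned p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

// Fraction of a padded texture that ends up as filler space.
double wastedFraction(unsigned w, unsigned h) {
    const double padded = static_cast<double>(nextPowerOfTwo(w)) * nextPowerOfTwo(h);
    return 1.0 - (static_cast<double>(w) * h) / padded;
}
```

As an invented example, a 300x200 image padded to 512x256 wastes a little over half of the texture, which shows how quickly filler adds up when every odd sized image gets its own texture.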
So my first system was doomed. My second system is a lot better. It still creates tilesheets automatically, but now it can do it without any white space. You define a number of smaller tilesheets and give some basic info. In the future I'll add a GUI component to the editor to do this. And another pending feature is to change indexes from absolute to small tilesheet specific. This will work great with mods, where each mod can add any number of tiles without disturbing other tiles, either adding new objects or modifying existing objects.
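The pending switch from absolute to tilesheet specific indexes could look something like the sketch below. This is only a guess at the shape of such a feature: TileRegistry, addSheet and globalIndex are hypothetical names, and the idea that each sheet receives a contiguous block of global indexes is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical registry: each small tilesheet gets a contiguous block of
// global indexes, so content only ever refers to (sheet, local index).
class TileRegistry {
public:
    TileRegistry() : total(0) {}

    // Registers a sheet with tileCount tiles and returns its sheet id.
    std::size_t addSheet(std::size_t tileCount) {
        firstIndex.push_back(total);
        counts.push_back(tileCount);
        total += tileCount;
        return firstIndex.size() - 1;
    }

    // Translates a sheet-local index into the global index used for drawing.
    std::size_t globalIndex(std::size_t sheet, std::size_t local) const {
        assert(sheet < firstIndex.size() && local < counts[sheet]);
        return firstIndex[sheet] + local;
    }

private:
    std::vector<std::size_t> firstIndex;
    std::vector<std::size_t> counts;
    std::size_t total;
};
```

With a scheme like this, a mod that adds four tiles never has to know how many tiles the base game or any other mod defines, because its own indexes always start at zero.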
As you can see, these new features are highly experimental and took a short time to write the first time, but then it took a lot to rewrite them in a more appropriate manner using my new knowledge acquired experimentally. But there is one new feature that is intentional and a natural evolution of the game.
You see, most of the time you can't queue up more than one action on an object. As an example, you can only give the order to plant or harvest a plant. I am talking about the same plant. You can't queue up a plant operation and a harvest, because the plant will take months to grow. But some objects permit this. You can queue a wall smooth operation, followed by engraving and finally digging. You can then unpause the game (if you paused, but with the new scheduler pausing should be quite rare) and the actions will be executed in order. But this is a thing of the past. I modified designations so you can only have a single designation on an object at a time. Things are simpler this way and more intuitive. It also fits better with the new more forgiving scheduler. As an added bonus, this way you won't make mistakes where you designate a huge area and in a sub region you forget to add a designation, queuing up let's say 3 actions in the general area, but only 2 in the area where you mis-selected. So this is intentional, but I'm sure I'll have a few bug reports regarding this once I release. So be prepared to meet mister "not a bug, a feature".
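In data structure terms, the change amounts to replacing a queue of designations per object with a single slot. The sketch below assumes that a new designation simply overwrites the old one; the post does not say whether the game replaces or rejects it, and MapObject and designate are illustrative names.

```cpp
#include <cassert>

enum Designation { NONE, SMOOTH, ENGRAVE, DIG };

// Hypothetical map object with exactly one designation slot instead of a queue.
struct MapObject {
    Designation designation;
    MapObject() : designation(NONE) {}
};

// Sets the designation and returns the one that was overwritten (assumed
// behaviour: the newest designation wins).
Designation designate(MapObject& obj, Designation d) {
    const Designation previous = obj.designation;
    obj.designation = d;
    return previous;
}
```

Designating a wall for smoothing and then for digging leaves only the dig order in place, so a big area selection can never end up with different queue lengths in different corners.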
I also made some changes to scheduling filters yet again, enough to fill half a post if I described them here. And some other changes, which I won't mention. OK, I'll mention one more: in the past I had multiple files which represented a single logical entity. This is confusing, so from now on I have one single XML file describing game data (i.e. that is a single logical entity). Every mod or content pack will have one single XML file and zero or more PNG files. But I can split it up if I wish, making one content pack for plants, one for animals, one for stone, etc. These will be self contained, but able to communicate.
So in my next post, I'll be presenting the new feature I have been working on. This should give you a fair overview of the final form (for now) of one of the game's features. So you've guessed it: the second presentation video/tutorial. Let's hope this brings more controversy than the first one! Everybody loves controversy! The features are done, all I need is to finish the script for the video. Yes, they do have scripts! In the past I also needed to practice my videos, but I am getting better and I only need 2-3 takes. You can mess up a video with one single misclick or a longer moment of inactivity.