Well, from everything I've read, heard and discussed, I think Socks asked a reasonable and valid question about clearer, easier-to-understand information regarding the new rendering engine.
It's not a clear-cut win for everyone, and will actually have a negative effect on some game types, particularly ones like point-and-click games with large, high-quality/detailed graphics. It's going to be great for many users' games, and not so great for others.
There are lots of claims about faster loading etc... but as yet none of that has been proven. I'm guessing there has to be a trade-off somewhere... surely the chopping up of the assets, comparing them for repeats etc., then recompiling them all back together has to take some time somewhere.
I hope it's as all-singing-and-dancing as claimed... it's been a long time in the making... and any improvement should surely be for the better?
But again, like Socks said, it would just be better if clear and easy-to-find info were provided... ideally with data and comparative examples.
And surely... @Lovejoy... for all the posts arguing it out, you could have just given any info you know and used a lot less time and fewer words...
I am not sure what information is lacking; we have a very in-depth technical presentation here. Then last month we talked some more about it at the end of the meetup. Also, if you are Pro you can download 0.14 and try it out, or if you have Windows you can try it out with the 0.13.14 native preview. I am happy to answer any question you have, and I will even answer them at the meetup this month.
Adobe upgrades and changes
Facebook upgrades and changes
Xcode upgrades and changes
Flash upgrades and changes
Apple policy upgrades and changes
Google upgrades and changes
iOS upgrades and changes
Spine upgrades and changes
Logic upgrades and changes
I respect what Socks is saying...
All those examples you just listed are the same reason why I don't understand why GS Creator is still considered to be in beta. It's a product. It's a product that has been out to the public for a long time, has a price guide, has had multiple updates, and has had multiple features added. Its customers are profiting from what the product initially provided and profit from what it continues to provide. Why is it still considered beta? What has to happen for them to say 1.0? Do they know it's OK to eventually say 1.1 or 2.0?
Why is it still considered Beta? What has to happen for them to say 1.0?
Personally I find it puzzling why people care what GameSalad is called? It is what it is and does what it does. If it's called beta or 1.0 or version Peanut Butter... it does the same thing?
That's what makes it so easy to ask. Curiosity. I'm not sure of GS's beta name strategy? Curiosity is a totally valid reason.
Sorry, thought you were stressing out about it.
Good update - thanks for the weekly posts!
I miss digging ditches sometimes, shovels and picks never change. No new technologies there. Just different handle materials.
@Chunkypixels said:
Well, from everything I've read, heard and discussed, I think Socks asked a reasonable and valid question about clearer, easier-to-understand information regarding the new rendering engine.
It's not a clear-cut win for everyone, and will actually have a negative effect on some game types, particularly ones like point-and-click games with large, high-quality/detailed graphics. It's going to be great for many users' games, and not so great for others.
There are lots of claims about faster loading etc... but as yet none of that has been proven. I'm guessing there has to be a trade-off somewhere... surely the chopping up of the assets, comparing them for repeats etc., then recompiling them all back together has to take some time somewhere.
I hope it's as all-singing-and-dancing as claimed... it's been a long time in the making... and any improvement should surely be for the better?
But again, like Socks said, it would just be better if clear and easy-to-find info were provided... ideally with data and comparative examples.
And surely... Lovejoy... for all the posts arguing it out, you could have just given any info you know and used a lot less time and fewer words...
I feel like there are a lot of little things wrong with the 14.0 branch that just haven't been found yet, because it was put on the back burner until 13.14 came out (appreciated). So we'll see what issues remain after a stable 14 comes around.
That being said, for me, it's scary putting my large project in the new rendering engine. So far results haven't been the best (low RAM usage at the cost of poor FPS and graphical glitches). I'm glad 13.14+ is around though; I may have to use that for publishing in the future.
@BlackCloakGS said:
I am not sure what information is lacking; we have a [very in-depth technical presentation here]
Cheers, I've actually already watched that one; with it getting on for a year old now I wasn't sure how much of it was still relevant, but I'll take your recommendation as indicating that it's all still current information (and give it another watch too).
@BlackCloakGS said:
I am happy to answer any question you have . . .
Overall what would you say is going to be the main benefit of the new rendering system - Faster loading times ? Better handling of large numbers of actors ? More consistent frame rates (where the processor is pushed) ? Smaller RAM footprint ? (etc).
I'm also guessing that we will see efficiency improvements in things like tiling and the Replicate behaviour ?
Cheers in advance for any info.
The new rendering system breaks an image up into 64x64 tiles that are LZ4_HC compressed so they should be comparable to PNGs. Tiles are streamed into one to four large mega textures of 2048x2048 apiece, which is the maximum texture size on most mobile devices. So you can fit 4096 tiles into memory at once. If you have more than 4096 tiles the older tiles will be evicted from memory and new tiles streamed in to take their place. So you never run out of texture memory ever! That's one big advantage. Memory doesn't really matter any more. Your scene can be as big as you like. What you'll see is a slight pause in the frame rate whenever you load new tiles from disk but it's not a huge pause.
The other main advantage to the new rendering system is speed. Almost everything on the screen in your game will be issued to the GPU with one draw call. Awesome batching madness.
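To put rough numbers on the figures above, here's a tiny back-of-the-envelope sketch (in C++, purely illustrative - the constants are the ones quoted in the post, not pulled from GameSalad's source):

```cpp
#include <cstdio>

int main() {
    // Constants quoted above: 64x64 tiles, 2048x2048 mega-textures,
    // up to four mega-textures resident at once.
    const int kTileSize     = 64;
    const int kMegaTexSide  = 2048;
    const int kMegaTexCount = 4;

    const int tilesPerRow = kMegaTexSide / kTileSize;      // 32
    const int tilesPerTex = tilesPerRow * tilesPerRow;     // 1024
    printf("%d tiles per mega-texture, %d resident in total\n",
           tilesPerTex, tilesPerTex * kMegaTexCount);      // 1024, 4096
    return 0;
}
```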
Thanks for the response, much appreciated . . .
@Toyman said:
The new rendering system breaks an image up into 64x64 tiles that are LZ4_HC compressed so they should be comparable to PNGs.
So I guess the compression is lossless ?
Also (related somewhat) does the tile comparison algorithm, the process that searches the database for identical tiles, have a threshold when comparing values or is it absolute ? I'm guessing it's absolute, so if I were to have a 320 x 64 image, a flat yellow colour, then I apply a tiny amount of noise to it (even just 1%) I would end up with 5 x 64 x 64 pixel tiles rather than 1 x 64 x 64 pixel tile if I had not applied any noise - as the noise will defeat the comparison algorithm ?
. . . . . . . . . .
Also, do the tiles need to be vertically and horizontally aligned so as to pass the comparison algorithm ?
Example, I'm going to guess that this will give us a single 64 x 64 tile . . .
. . . whereas this will give us 5 x 64 x 64 tiles ?
So for example this image (below) would not fare well with the new system ? . . .
. . . so you might be better off authoring the above image like this (below), rotated so it takes full advantage of the tiling system . . .
. . . and then rotate it within GameSalad back to the correct angle ?
(assume a noiseless flat yellow colour in all these images - ignore the compression artefacts introduced by uploading to the forum)
. . . . . . . . . .
And also related . . . does the system recognise rotated tiles ?
. . . . . . . . . .
@Toyman said:
Tiles are streamed into one to four large mega textures of 2048x2048 apiece, which is the maximum texture size on most mobile devices. So you can fit 4096 tiles into memory at once.
In the video that @BlackCloakGS linked to, it says that these megatextures are 4096 x 4096 pixels (and 64MB) each ? I'm guessing with the video being almost a year old some of the specs have changed over time ?
And the four megatextures are per scene, rather than the whole project ?
Do the megatextures organize themselves on a pixel level or do they use the file dimensions, for example imagine I have imported a circle of 200 x 200 pixels (let's assume the circle itself is full of random noise while the B/G is completely transparent), is this written to the megatexture as a 200 x 200 pixel square, with the empty areas around the corners of the square left empty, or does the process move other data into these 'corners' - or otherwise efficiently manage the available space ?
. . . . . . . . . .
@Toyman said:
If you have more than 4096 tiles the older tiles will be evicted from memory and new tiles streamed in to take their place. So you never run out of texture memory ever! That's one big advantage. Memory doesn't really matter any more. Your scene can be as big as you like. What you'll see is a slight pause in the frame rate whenever you load new tiles from disk but it's not a huge pause.
Fantastic stuff !!
@Toyman said:
The other main advantage to the new rendering system is speed. Almost everything on the screen in your game will be issued to the GPU with one draw call. Awesome batching madness.
That's what I like to hear
I saw that an additional blend mode (let's say Screen) will require an additional draw call, what would be the effect of an additional draw call, for example in a project where you have 20 or 30 actors floating around the screen, and one of those actors is set to Screen, would we notice two draw calls instead of one ?
I imagine there are plenty of situations where you could use preparation outside of GS (for example in Photoshop) to negate the need for a blend mode if it came to it.
And while on the subject of blend modes, will we ever see Multiply fixed ?! It would be enormously more useful if it worked as Multiply works in pretty much all other software.
Cheers in advance for any input.
@Toyman
Also . . . . are the megatextures generated at the publishing stage ?
And if this is the case does it mean that we won't see the improvements in speed when using the GameSalad viewer on - for example - an iPad ?
. . . . . . . . .
And if the rendering system breaks images into 64 x 64 pixel tiles can we now (or at least when 14.1 is officially out) largely forget the recommendation to keep our image asset dimensions within the powers-of-2 rule ?
@JSproject said:
Socks, from my understanding and based on your orange picture examples:
1 and 2 would be a single 64x64 tile.
3. a single 64x64 tile also (but on this one I might be wrong)
4. two 64x64 tiles
I'm assuming that the tiling system won't recognise rotated or flipped or offset tiles (and at this stage it really is only an assumption), so from what I can tell . . .
1 would be 1 unique tile
2 would be 10 unique tiles
3 would be 11 unique tiles
4 would be 3 unique tiles
(unique tiles marked with a black dot, duplicate tiles marked with a blue dot)
@Socks yes that may very well be the case, we need some clarification from @Toyman - and great illustrations btw
Thanks everyone for the great questions and illustrations from Socks. I couldn't have done better myself! I'll try and clarify.
1. The compression scheme is lossless. It's basically a really high compression engine that kicks Zip in the pants and gets away with it. Another HUGE advantage to LZ4_HC is that compression is slow but decompression is uber fast. In other words it takes a long time to compress a tile (a second or two) but decompression, the biggest factor in our load times, is very quick. Faster off the disk. Faster to decompress. It's microseconds to get it onto the video card.
2. Tile comparisons are absolute. The new code coming online soon creates a SHA1 hash of the tile texels and uses that as an index into a database of tiles. Thus tile sharing is across all images in the database. If you have a solid block of 64x64 white it will be used in every image. This keeps the mega-texture use small. (See the sketch at the end of this post.)
3. Noise will defeat the comparison algorithm. Yes, you're right.
4. Consider a fullscreen image that has lots of solid color and you wanted to add a nice lighting effect. It would be FAR better to let the tile ripper have at the solid colors without the lighting effect. You'd generate fewer tiles. Save the lighting effect as a standalone greyscale image and pull it into Creator as a separate actor. Set the blend mode to Multiply to get your lighting effect. Compositing is key to small tile counts.
5. Your examples above are spot on. 3 would have more sharing than you're showing.
6. The system does not recognize rotated or flipped tiles. It should. It just doesn't right now. I'll add that to my TODO list.
7. The four mega-textures are per scene.
8. To clarify: the size of the mega-texture actually depends on the device you're running on. On OSX it'll be the maximum texture size that can fit into video memory! If you have a GPU with only 256MB of VRAM then you're only going to get a couple of 2048x2048 mega-textures, not four. There will be more swapping from disk. For a fully loaded Mac with 2GB of VRAM you'll get four 16384x16384 mega-textures. It only creates mega-textures 1-3 when it needs to, which would be never in most cases on high-end hardware because the entire scene would fit onto one texture. On Windows we use texture resizing as well. It starts out at 512x512 and will grow up to the max texture size that will fit in VRAM. It's not always the same number. It depends on the machine. iOS and Android will always use a max texture size of 2048x2048. An iPhone 4 has less memory than an iPhone 6 and the mega-texture allocator will take that into account. There's a fudge factor in there as well because we don't want to take up ALL your device's memory for tiles! That would be crazy. So most of the time it's the available memory after all the other assets have loaded, like sounds.
9. Draw calls are wonderfully expensive in OpenGL. Not so much in Metal, which we'll get to once Graphene's out the door! And DirectX. But I digress. If you only have thirty actors on screen you'll not notice a difference between one and a handful of draw calls. Once you start doing particles, however, it really matters. Particles are ripped into tiles too! For every image you draw I generate a list of triangles for every tile. So the larger the image the more geometry you have. The other thing we don't do, but is on my list of things to fix, is sort by material and Z order. If you batch all the Screen blends together then that's one draw call. Batch the others too and that's just two. Basically what I'm saying is this: the draw calls have a small cost in and of themselves, but it's the attendant state changes and buffer changes that are very costly. One batch can have up to 65,536 vertices spanning many tiles. There are 6 vertices per tile. So that's a lot! Each vertex is 36 bytes long. If you break the scene drawing into four draw calls due to blend mode changes then you have to i) write the vertex data to the VBO, ii) set all the render states, iii) upload uniforms to the shader, and iv) send all that data down to the video card. Some devices (Windows and Android) must upload all 65,536 vertices every time. You can see how that gets expensive. On OSX and iOS it's not so bad because you can elect to send only the vertices you actually changed down to VRAM by DMA. iOS is especially nice, and will be UBER AMAZING with Metal, because VRAM and regular RAM are all one memory pool. Just like the Xbox.
10. Mega-textures are generated on the fly in the game. They change constantly as new art is loaded and old art goes bye-bye.
11. There is no power-of-two texture limit any more. Make your images any size you like. If the dimensions are not a multiple of 64 I will pad your image with black with zero alpha.
12. The GameSalad viewer does not use the image database. It loads raw PNGs and rips them into tiles at load time. It's slow, but mandatory for now because of the way the images are sent across the network. I couldn't change everything!
Hopefully that wasn't confusing!
From this discussion I come away with the impression that we'd like to see the ripper recognizing rotated and flipped/mirrored tiles. Can do, once I get some time.
Also sorting tiles by Z order and material would be awesome.
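As promised in points 1 and 2, here's a minimal sketch of what a tile ripper along those lines could look like. It uses the public liblz4 API (lz4.h/lz4hc.h, which really does pair slow LZ4_compress_HC with fast LZ4_decompress_safe); the 64-bit FNV hash is a stand-in for the SHA1 mentioned above, and none of the type or function names are GameSalad's actual ones:

```cpp
#include <lz4.h>
#include <lz4hc.h>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Tile = std::vector<uint8_t>;                 // 64*64*4 raw texels

static uint64_t HashTexels(const Tile& t) {
    uint64_t h = 1469598103934665603ull;           // FNV-1a, SHA1 stand-in
    for (uint8_t b : t) { h ^= b; h *= 1099511628211ull; }
    return h;
}

struct TileDatabase {
    std::unordered_map<uint64_t, uint32_t> index;  // texel hash -> tile id
    std::vector<std::vector<char>> packed;         // LZ4_HC-compressed tiles

    // Identical tiles collapse to one id shared by every image.
    // (A real database would also compare texels on a hash collision.)
    uint32_t Intern(const Tile& t) {
        auto [it, isNew] = index.try_emplace(HashTexels(t),
                                             (uint32_t)packed.size());
        if (isNew) {
            // Slow, high-ratio compression happens once, at build time.
            std::vector<char> buf(LZ4_compressBound((int)t.size()));
            int n = LZ4_compress_HC((const char*)t.data(), buf.data(),
                                    (int)t.size(), (int)buf.size(), 9);
            buf.resize(n > 0 ? n : 0);
            packed.push_back(std::move(buf));
        }
        return it->second;
    }

    // Fast decompression happens at load time, straight into the cache.
    Tile Load(uint32_t id) const {
        Tile out(64 * 64 * 4);
        LZ4_decompress_safe(packed[id].data(), (char*)out.data(),
                            (int)packed[id].size(), (int)out.size());
        return out;
    }
};
```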
If mega-textures are generated on the fly, what determines the point of their creation? Is it on scene load, when a device meets the upper RAM limitations, or just what is visible to the camera for rendering?
Are mega-textures dynamic themselves; can they swap one unused texture for another that needs to be used, without completely taking apart the old one?
How does it currently handle blend modes? I know I've seen some major issues with them in the nightly; I'm guessing it's because it's not differentiating textures on actors properly from the rest of the mega-texture and accidentally rendering them as a blend mode as well?
Finally, is it possible that you can overload the system with too many textures trying to be loaded and not enough room in the mega-texture, because it can't dump everything it needs to (maxing out RAM), or is that just not possible?
If so, would this RAM limitation be any lower than the older engine, since you have imposed some hard-coded limitations (fudge factor)?
Oh yeah, also what about the option to pre-create your own mega-textures, or the ability to flag important assets that should not be dumped and reloaded, but kept in memory?
Thanks for the response, lots of really useful information.
@Toyman said:
1. The compression scheme is lossless. It's basically a really high compression engine that kicks Zip in the pants and gets away with it. Another HUGE advantage to LZ4_HC is that compression is slow but decompression is uber fast.
Lossless + uber fast decompression sounds great to me !!
@Toyman said:
6. The system does not recognize rotated or flipped tiles. It should. It just doesn't right now. I'll add that to my TODO list.
That would be great. I imagine you get to a point where extra searches for duplication start to return fewer and fewer results and just make the processing slower and slower, but rotated and/or flipped tiles would be useful.
@Toyman said:
8. To clarify: the size of the mega-texture actually depends on the device you're running on.
Thanks, that all makes sense.
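For anyone who wants the gist of that sizing logic in code form, here's a rough sketch; the fudge factor and growth steps are illustrative guesses, not the engine's real numbers:

```cpp
#include <cstddef>

// Grow the mega-texture side (512 -> 1024 -> ...) until the RGBA
// allocation would blow a budget derived from free VRAM. The halving
// "fudge factor" mirrors the idea of never taking all of a device's
// memory for tiles; the actual engine values aren't public.
size_t PickMegaTextureSide(size_t vramBytesFree, size_t maxSide) {
    const size_t budget = vramBytesFree / 2;       // fudge factor
    size_t side = 512;                             // Windows starting size
    while (side * 2 <= maxSide &&
           (side * 2) * (side * 2) * 4 <= budget)  // 4 bytes per texel
        side *= 2;
    return side;
}
```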
@Toyman said:
9. If you batch all the screen blends together then that's one draw call. Batch the others too and that's just two.
Could these screen blends, eating up their own draw call, be mixed - by that I mean could a single draw call handle all the actors with a blend mode other than normal (Multiply, Screen, Add . . . etc) - or is a separate draw call required for each blend mode ? - a separate draw call for Add - a separate draw call for Multiply . . . etc ?
@Toyman said:
9. Basically what I'm saying is this. The draw calls have a small cost in and of themselves but it's the attendant state changes and buffer changes that are very costly
Makes sense.
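A sketch of that batching rule, using the figures quoted above (6 vertices per tile quad, a 65,536-vertex batch, 36 bytes per vertex); FlushBatch stands in for the real VBO write plus draw call, and the types are invented for illustration:

```cpp
#include <cstdio>
#include <vector>

enum class Blend { Normal, Screen, Add, Multiply };
struct Actor { Blend blend; int tileCount; };

static void FlushBatch(Blend b, int verts) {
    if (verts == 0) return;
    // Stand-in for: write VBO, set render state, upload uniforms, draw.
    printf("draw call: blend=%d, %d vertices, %d bytes\n",
           (int)b, verts, verts * 36);
}

void DrawScene(const std::vector<Actor>& zOrdered) {
    const int kMaxVerts = 65536;                   // one batch's budget
    Blend cur = Blend::Normal;
    int verts = 0;
    for (const Actor& a : zOrdered) {
        const int v = a.tileCount * 6;             // 6 vertices per tile
        if (a.blend != cur || verts + v > kMaxVerts) {
            FlushBatch(cur, verts);                // state change breaks batch
            cur = a.blend;
            verts = 0;
        }
        verts += v;
    }
    FlushBatch(cur, verts);
}
```

At 36 bytes a vertex, one full batch is about 2.25 MB of vertex data, which is why re-uploading all of it per draw call hurts on the platforms that can't DMA just the changed range.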
@Toyman said:
11. There is no power-of-two texture limit any more. Make your images any size you like. If the dimensions are not product-of-64 I will pad your image with black with zero alpha.
Makes sense too ! Yeah to 1200 x 40 ground images ! Can we have custom padding colours ? (that was a joke by the way )
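For the record, the round-up rule from point 11 is just this (so that 1200 x 40 ground image is stored as 1216 x 64, padded with zero-alpha black):

```cpp
#include <cstdio>

int RoundUpTo64(int n) { return (n + 63) & ~63; }  // next multiple of 64

int main() {
    // A 1200 x 40 image becomes a 1216 x 64 tile grid, with the
    // extra texels padded as transparent black.
    printf("%d x %d\n", RoundUpTo64(1200), RoundUpTo64(40));  // 1216 x 64
    return 0;
}
```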
@Toyman said:
12. The GameSalad viewer does not use the image database. It loads raw PNGs and rips them into tiles at load time. It's slow but mandatory for now because of the way the images are sent across the network. I couldn't change everything!
I suspected that was the case, so although we can preview our games on the viewer if we really want to see the full impact of the new rendering system we would have to make an ad hoc ?
No, that was some very useful information, cleared up a lot of loose ideas, it's sounding like this could be an enormous leap forward when everything is eventually settled into place !
@Toyman said:
From this discussion I come away with the impression that we'd like to see the ripper recognizing rotated and flipped/mirrored tiles. Can do, once I get some time.
If it would increase efficiency (obviously not at the processing stage) then those additions would be great - for example the last example (4) of my illustrations would go from 3 tiles to just 2 tiles with rotated tiles being recognized.
@Toyman said:
Also sorting tiles by Z order and material would be awesome.
Yes please !
Thanks!
@AlchimiaStudios said:
If mega-textures are generated on the fly, what determines the point of their creation? Is it on scene load, when a device meets the upper RAM limitations, or just what is visible to the camera for rendering?
Are mega-textures dynamic themselves; can they swap one unused texture for another that needs to be used, without completely taking apart the old one?
How does it currently handle blend modes? I know I've seen some major issues with them in the nightly; I'm guessing it's because it's not differentiating textures on actors properly from the rest of the mega-texture and accidentally rendering them as a blend mode as well?
Finally, is it possible that you can overload the system with too many textures trying to be loaded and not enough room in the mega-texture, because it can't dump everything it needs to (maxing out RAM), or is that just not possible?
If so, would this RAM limitation be any lower than the older engine, since you have imposed some hard-coded limitations (fudge factor)?
When tiles are visible on screen.
The way mega-textures work is like this. A logical tile index is maintained, call it the tile cache, that holds up to 65,536 tile entries. When an actor is loaded from the scene it queries the image database for all its tiles. It checks against the tile cache first to see if a tile has already been loaded, and does nothing more if it's already in the cache. If the tiles are not in the cache it puts in a load request with the image database to pull in all the tiles the image needs. They are all loaded into the tile cache. Now when it comes time to draw the actor, all its tiles are clipped against the screen. So if you have a 1024x960 screen, for example, and the image is 2048x2048, not all of those tiles will be loaded or drawn. Only the visible ones. As tiles become visible due to scrolling, those tiles will be pulled into the tile cache. Everything is based on visibility for this streaming code. Once the tile texels have been loaded the mega-texture is updated. Now if you ever get into a situation where only half of the screen's tiles fit into all the mega-textures, we will draw all those actors whose tiles are in memory first (breaking the batch), then load the next set of tiles, upload them, and draw them. Frame rate will suffer in these situations because of disk loads and tile decompression. You have about 4096 tiles on average (if four mega-textures at 2048x2048 each).
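A minimal sketch of that visibility-driven cache, assuming a plain LRU policy ("older tiles get evicted"); the real engine's bookkeeping is surely more involved:

```cpp
#include <cstdint>
#include <list>
#include <unordered_map>

struct TileCache {
    size_t capacity = 4096;   // roughly four 2048x2048 mega-textures' worth
    std::list<uint32_t> lru;  // front = most recently touched tile id
    std::unordered_map<uint32_t, std::list<uint32_t>::iterator> where;

    // Called for each tile that clips against the visible screen.
    // Returns true if resident; false means "request it from the
    // image database and decompress it when it arrives".
    bool Touch(uint32_t tileId) {
        auto it = where.find(tileId);
        if (it != where.end()) {
            lru.splice(lru.begin(), lru, it->second);  // hit: re-front it
            return true;
        }
        if (lru.size() >= capacity) {                  // full: evict oldest
            where.erase(lru.back());
            lru.pop_back();
        }
        lru.push_front(tileId);                        // miss: make resident
        where[tileId] = lru.begin();
        return false;
    }
};
```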
Mega-textures are completely dynamic. You can have a percentage of an actor's tiles in one mega-texture, a percentage in another, and a third percentage on disk. On-disk tiles will only be loaded if you try to draw the actor on screen.
Blend modes are handled the same way as the old engine. If the blend mode changes between actors the batch is broken, a draw call submitted to the GPU command buffer, and we continue on our merry way. Apparently there are some bugs with blend modes, which I will have to fix and soon.
In the old system if you added too many textures the system would just eventually crash. Every actor was an individual draw call. It was using GL 1.1 calls (no VBOs and VAOs) even with GLES2. Now you will never run out of memory. You just have actors whose tiles can be in one of three states: 1) in the tile cache, 2) in VRAM and 3) on disk. By overwhelming the system with super massive textures (which is bad) and not compositing nicely with multiple actors using the blend modes, all you'll do is slow down your frame rate because you have to evict visible tiles. On Windows memory use is better than the old system because I resize the mega-textures as the load increases, from 512x512x4 up to 16384x16384x4, or 262,144 tiles in VRAM. I could not do this on iOS devices due to a bug in the GL driver which didn't flush the texels after I resized. Very annoying. Android is just damned crippled with its GL|ES 2 implementation. It lies. The bottom line is the new system will use more memory for small levels with few tiles, because large swaths of the mega-texture will be wasted. On large games it'll use far less - if the old GameSalad could run the game at all! If you build your levels badly with massive noisy textures you will lose all the performance that we gain from batching draw calls. SSDs on mobile are fast but not that fast. I can improve this with a FIFO queue but it's not in there now.
The RAM limitations are going to be slightly higher on OSX, iOS and Android because I have to allocate one big 2048x2048 (or whatever fits in VRAM, may be smaller for systems with limited VRAM) RGBA texture up front. Not four. I only allocate four if I need to. The biggest problem this solves is memory fragmentation. With the old system you could actually run out of video memory when the system reported enough space! This was because of fragmentation: lots of little textures means a lot of wasted space if you start deleting just a few of them.
@AlchimiaStudios said:
Oh yeah, also what about the option to pre-create your own mega-textures, or the ability to flag important assets that should not be dumped and reloaded, but kept in memory?
No support for creating your own mega-textures. You can't mark assets that are important because you don't know ahead of time what their draw order is. The tile streaming is based on visibility per tile not per image. It would be a pain in the bum to assign individual priorities to tiles.
@Socks said:
Could these screen blends, eating up their own draw call, be mixed - by that I mean could a single draw call handle all the actors with a blend mode other than normal (Multiply, Screen, Add . . . etc) - or is a separate draw call required for each blend mode ? - a separate draw call for Add - a separate draw call for Multiply . . . etc ?
Imagine you had actors A, B, C, D and E with blend modes A=Normal, E,B=Screen and C,D=Multiply. You would have three draw calls if the Z order was A, EB, CD. If the Z order was A, B, C, D, E then that would be four draw calls.
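That example is easy to sanity-check: one draw call per run of consecutive actors in Z order sharing a blend mode (a throwaway sketch, not engine code):

```cpp
#include <cstdio>
#include <vector>

enum class Blend { Normal, Screen, Multiply };

int CountDrawCalls(const std::vector<Blend>& zOrder) {
    int calls = 0;
    for (size_t i = 0; i < zOrder.size(); ++i)
        if (i == 0 || zOrder[i] != zOrder[i - 1])
            ++calls;                       // blend change breaks the batch
    return calls;
}

int main() {
    using B = Blend;
    // A=Normal, B/E=Screen, C/D=Multiply
    printf("%d\n", CountDrawCalls({B::Normal, B::Screen, B::Screen,
                                   B::Multiply, B::Multiply}));  // A,E,B,C,D -> 3
    printf("%d\n", CountDrawCalls({B::Normal, B::Screen, B::Multiply,
                                   B::Multiply, B::Screen}));    // A,B,C,D,E -> 4
    return 0;
}
```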
@Socks said:
That would be great. I imagine you get to a point where extra searches for duplication start to return fewer and fewer results and just make the processing slower and slower, but rotated and/or flipped tiles would be useful.
Actually you store off a bit for horizontal flip and another for vertical flip with the tile in the index and there's no extra processing. Only processing is to flip and mirror in the image database builder tool. Doesn't take long, especially with SSE compares.
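In other words: hash the tile in one canonical orientation and keep two flip bits alongside the tile id. A sketch of that idea (the flip helper and hash are simplified stand-ins; the real tool works on SHA1 hashes with SSE compares):

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using Tile = std::vector<uint8_t>;     // 64x64 RGBA texels
static const int kSide = 64, kBpp = 4;

// Mirror a tile horizontally and/or vertically.
Tile Flip(const Tile& t, bool h, bool v) {
    Tile out(t.size());
    for (int y = 0; y < kSide; ++y)
        for (int x = 0; x < kSide; ++x) {
            const int sx = h ? kSide - 1 - x : x;
            const int sy = v ? kSide - 1 - y : y;
            for (int c = 0; c < kBpp; ++c)
                out[(y * kSide + x) * kBpp + c] =
                    t[(sy * kSide + sx) * kBpp + c];
        }
    return out;
}

uint64_t HashTexels(const Tile& t) {   // SHA1 stand-in
    uint64_t h = 1469598103934665603ull;
    for (uint8_t b : t) { h ^= b; h *= 1099511628211ull; }
    return h;
}

struct CanonicalTile {
    uint64_t hash;   // hash of the smallest of the four orientations
    uint8_t  flips;  // bit 0 = horizontal flip, bit 1 = vertical flip
};

// All four flip variants collapse to one database entry; the flip
// bits stored with the tile index cost nothing extra at runtime.
CanonicalTile Canonicalize(const Tile& t) {
    Tile best = t;
    uint8_t bits = 0;
    for (uint8_t f = 1; f < 4; ++f) {
        Tile v = Flip(t, f & 1, f & 2);
        if (v < best) { best = std::move(v); bits = f; }
    }
    return { HashTexels(best), bits };
}
```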
@Socks said:
I suspected that was the case, so although we can preview our games on the viewer if we really want to see the full impact of the new rendering system we would have to make an ad hoc ?
An ad hoc would only show you load times, because Creator and the Viewer do all the work that the image database creator tool (gsidb) does. The actual usage of tiles is the same. The only difference really is that the Player loads from an image database that's been produced at publishing time and Viewer/Creator loads from PNGs. Tiles are still produced no matter what.
Also, the newest version of FreeImage has gone into the nightly build, so a lot of the problems we were seeing with PNG images go away.
Awesome, thanks for clearing that up, makes a lot of sense now.
I'm feeling a bit more confident in this new engine, and the knowledge here will help significantly in optimizations with it.
@Toyman How will the system handle a situation where a series of actors are scrolling down the screen, wrapping around to the top again, having their image changed, and scrolling back onto the screen in a continuous loop? There would be a decent amount of shared tiles amongst the images they'd be displaying, but hundreds of different possible images.
@Toyman said:
Imagine you had actors A, B, C, D and E with blend modes A=Normal, E,B=Screen and C,D=Multiply. You would have three draw calls if the Z order was A, EB, CD. If the Z order was A, B, C, D, E then that would be four draw calls.
Thanks, makes sense - best to keep blend modes in grouped layers then !
@Toyman said:
Actually you store off a bit for horizontal flip and another for vertical flip with the tile in the index and there's no extra processing. Only processing is to flip and mirror in the image database builder tool. Doesn't take long, especially with SSE compares.
@Toyman said:
An ad hoc would only show you load times because Creator and the Viewer do all the work that the image database creator tool (gsidb) does. The actual usage of tiles is the same. The only difference really is that the Player loads from an image database that's been produced at publishing time and Viewer/Creator loads from PNGs. Tiles are still produced no matter what.
Great, so I'm going to interpret all that as the viewer giving a fairly good representation of final performance.
What about backwards compatibility @Toyman ? Do you expect old render engine games to run perfectly, or will we be stuck on this current version of GS if we don't want to redo existing games? Also, what about animation? I assume this new rendering engine will have some pluses and minuses in regards to animation performance. I noticed in the latest nightly that the animation speed has increased during testing in the Creator, but decreased in ad hoc builds on the device.
Any news as to when beta testing begins for the new marketplace? It feels like it's been half a year since any new content has been able to be uploaded.