DjaySaint
It's simple, actually. Imagine the game has a 4 GB limit, and textures are stored on the HDD, then loaded into system memory, then uploaded to video memory on use. Even if you have 100 GB of VRAM, the limit is 4 GB of process memory (system memory), and about 1.5-2 GB of it is already used up, much more with a big uGrids setting. ENBoost expands the usable memory, but the game still needs to send textures to video memory when they are used, and this produces stuttering, because that transfer runs at 1-4 GB per second and the game has no prediction for smooth loading; it simply pushes everything into VRAM when you turn the camera toward an area that wasn't visible for a long time. The unsafe hack skips copying and destroying resources between system memory and VRAM, but with the downsides you already know. The video memory amount parameter is the upper limit on how much video memory can be used for storing textures and geometry. It's great to have a big value reported by VRamSizeTest, but it's the game that decides when to destroy objects in video memory, and I can't do anything about that. Fortunately, because of game bugs, it mostly ignores how much VRAM is in use, but the game's own memory manager still unloads/destroys resources; I guess this can be changed in the game ini files.
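To put rough numbers on the stutter described above, here is a minimal sketch of the arithmetic. The 1-4 GB/s transfer rate comes from the post; the 512 MB burst size and the 60 fps frame budget are assumptions picked only for illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope estimate of how long a burst upload to VRAM takes.
# Assumptions (not from the post): 512 MB burst, 60 fps target.

FRAME_BUDGET_MS = 1000.0 / 60.0  # assumed 60 fps target, about 16.7 ms


def upload_time_ms(size_mb: float, bandwidth_gb_per_s: float) -> float:
    """Milliseconds needed to copy size_mb from system memory to VRAM."""
    size_gb = size_mb / 1024.0
    return size_gb / bandwidth_gb_per_s * 1000.0


def frames_stalled(size_mb: float, bandwidth_gb_per_s: float,
                   frame_budget_ms: float = FRAME_BUDGET_MS) -> float:
    """How many frame budgets the upload occupies if nothing is prefetched."""
    return upload_time_ms(size_mb, bandwidth_gb_per_s) / frame_budget_ms


if __name__ == "__main__":
    burst_mb = 512  # assumed amount of textures needed after a camera turn
    for bandwidth in (1.0, 2.0, 4.0):  # GB/s, the range given in the post
        ms = upload_time_ms(burst_mb, bandwidth)
        frames = frames_stalled(burst_mb, bandwidth)
        print(f"{bandwidth:.0f} GB/s: {ms:.0f} ms, about {frames:.1f} frames")
```

Even at the top of that range, a burst of this size takes several frames' worth of time, which is why it shows up as a visible hitch when nothing is loaded ahead of time.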
ExpandSystemMemoryX64 crashes only if the wrong SKSE memory manager settings are used, with crapware (rarely), or if a few SKSE mods are installed that have bugs regarding pointer addresses. This parameter does not affect stuttering much; it's mostly a CTD fix, because memory fragmentation is much lower with it.
SKSE memory manager + ENBoost is the best combo right now, but you need to set up SKSE properly. The too-high values I have seen as defaults cannot be used because of the OS memory manager; the game may fail to start at all (memory fragmentation occurs because of loaded DLLs, so the process can't allocate one big block of memory).
Try asking preset makers who have Titan cards; they probably have good memory manager and game ini settings, because I heard from them that the game runs fine and smooth, though not for all of them.
domjam
Just use the value reported by the DX9 version of VRamSizeTest; it's the maximum possible. Also try increasing the reserved size (may cause CTD if too big) and disabling/enabling the driver memory manager.
Marty McFly
Forgot to mention: blurring in two passes needs to be tweaked by range for each pass, not just by the "waves" of the pattern, to get much greater blurring than with plain passes, thanks to bilinear filtering of the input texture (AO/IL in one texture). There are many combos of such ranges; I recommend creating a sample image with pixel patterns for testing, for example lines as 1, 0, 1, 0 both horizontally and vertically, a checker, axis-aligned lines, skipped pixels like 1, 0, 0, 1. If I remember right, a range of 1.0 for the first pass and 1.25 for the second pass is one of the simplest ways to get that kind of extra blurring (really huge, like 4 passes). To optimize filter performance for low ranges, use a precomputed edge mask. Of course that doesn't work with huge blurring ranges, but I'm using a modified bilateral filter, so it's OK.
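The test-pattern idea above can be tried offline before touching a shader. What follows is only a sketch under stated assumptions: it is a 1D stand-in where linear interpolation plays the role of bilinear filtering, the blur is a plain 3-tap box rather than the modified bilateral filter mentioned in the post, and all function names are hypothetical.

```python
# 1D experiment: two blur passes with a different tap range per pass.
# Fractional ranges read between pixels, as bilinear filtering would.


def sample_linear(signal, x):
    """Linearly interpolated read with clamped edges."""
    x = min(max(x, 0.0), len(signal) - 1.0)
    i0 = int(x)
    i1 = min(i0 + 1, len(signal) - 1)
    t = x - i0
    return signal[i0] * (1.0 - t) + signal[i1] * t


def blur_pass(signal, rng):
    """3-tap box blur with taps at -rng, 0 and +rng pixels."""
    return [
        (sample_linear(signal, i - rng) + signal[i]
         + sample_linear(signal, i + rng)) / 3.0
        for i in range(len(signal))
    ]


def two_pass(signal, range1, range2):
    """Apply two blur passes, each with its own range."""
    return blur_pass(blur_pass(signal, range1), range2)


def contrast(signal, margin=4):
    """Leftover max-min difference, ignoring pixels near the clamped edges."""
    inner = signal[margin:-margin]
    return max(inner) - min(inner)


# Test patterns suggested in the post, reduced to one dimension.
PATTERNS = {
    "lines 1,0,1,0": [1.0, 0.0] * 16,
    "skipped 1,0,0,1": [1.0, 0.0, 0.0] * 11,
}

if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        for ranges in ((1.0, 1.0), (1.0, 1.25)):
            left = contrast(two_pass(pattern, *ranges))
            print(f"{name}, ranges {ranges}: remaining contrast {left:.4f}")
```

In this simplified setup, the 1, 0, 1, 0 pattern keeps some contrast after two passes at range 1.0 but is flattened completely when the second pass uses 1.25. A real 2D shader with different weights will behave differently, so treat this only as a way to compare range combos cheaply.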
And these tricks, how much performance can you get out of that?
Several times over. Short code is not the best for shaders, at least not for ambient occlusion. Try to find any 3D engine that lets you write shaders and watch the FPS impact; this is the only way to find out how things work. Nobody writes in the docs why it's better to avoid some instructions after others, how much space to leave between them, in which cases some ops can be much slower, etc. Also read about video card architecture when new models are presented on hi-tech sites; understand which ops are computed in which units, how they are linked, and how memory works. All of this also helps with heavy optimization (but not much, maybe 1.5 times max, unless the code was very bad initially).
Do you use game data to determine what's ambient, albedo, lighting?
No, I draw this separately if it doesn't exist in the game. That's why Skyrim has a deferred mode, while the game originally uses forward rendering with a shadow prepass (kind of deferred shadows). GTA4 and GTA5 are deferred, so I don't use anything except math to restore the data. GTASA is simple forward like most games, and if I remember right, only the latest versions used deferred, but it's not actually worth it: the game uses vertex lighting, and SSAO/SSIL blending is only partially better than without any data like albedo and lighting. To mix SSAO/SSIL in such cases I used math that computes the amount of AO from pixel brightness; the IL amount depends on the version, some use multiply-add math, some are affected by AO. It's just about the look of the final result; I don't see any other way. Because the game is mostly bright, the AO trick works, but it fails in very high-contrast areas between walls, where IL then fools the eye.
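For readers wondering what "amount of AO from pixel brightness" could look like, here is one possible sketch, not the actual ENB math. Everything in it is an assumption: the Rec. 709 luminance weights, the choice to apply more AO to brighter pixels, the `il_mult` factor, and the `il_affected_by_ao` switch (which only mirrors the remark that some versions let AO affect IL).

```python
# Hypothetical SSAO/SSIL mix for a forward-rendered frame with no albedo
# or lighting buffers. Colors are (r, g, b) tuples in the 0..1 range.


def luminance(rgb):
    """Rec. 709 luma as a stand-in for 'pixel brightness' (assumed)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def ao_amount_from_brightness(rgb, strength=1.0):
    """Assumed mapping: brighter pixels receive more of the AO term."""
    return min(max(luminance(rgb) * strength, 0.0), 1.0)


def mix_ssao_ssil(color, ao, il, il_mult=0.5, il_affected_by_ao=False):
    """Darken color by AO scaled with brightness, then add IL (multiply-add)."""
    amount = ao_amount_from_brightness(color)
    occlusion = 1.0 - amount * (1.0 - ao)  # lerp(1, ao, amount)
    il_scale = il_mult * (ao if il_affected_by_ao else 1.0)
    return tuple(c * occlusion + i * il_scale for c, i in zip(color, il))


if __name__ == "__main__":
    bright_wall = (0.9, 0.9, 0.8)
    dark_corner = (0.1, 0.1, 0.1)
    bounce = (0.2, 0.1, 0.05)  # example indirect light color
    print(mix_ssao_ssil(bright_wall, ao=0.5, il=bounce))
    print(mix_ssao_ssil(dark_corner, ao=0.5, il=bounce))
```

With a mapping like this, dark pixels get almost no extra occlusion, which is consistent with the remark that the trick works because the game is mostly bright and breaks down where contrast is very high.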