the Sim Settlements forums!

Register a free account today to become a member! Once signed in, you'll be able to participate on this site by adding your own topics and posts, as well as connect with other members through your own private inbox!

Long Save Research Tool

EDIT: it was a long save... 11min long. Let me see how far I can push this... turning chap2 off on this save xD
Looks like you're the current holder of the longest-save record. Thumbs up for keep playing this way and testing.
I also have the log (posted earlier) with SS2 looking in the default Steam location only. That is with Chapter 2 installed.
It's understandable you would have those kind of errors without Chapter 2, still the location where it is looking is a ? to me.
 
Looks like you're the current holder of the longest-save record. Thumbs up for keep playing this way and testing.
Unfortunately, someone reported a save that took the whole night :explode: . But I'll keep testing! This bug is driving me crazy, I need to know what it is, even if I'm not the one who finds out xD

I also have the log (posted earlier) with SS2 looking in the default Steam location only. That is with Chapter 2 installed.
It's understandable you would have those kind of errors without Chapter 2, still the location where it is looking is a ? to me.

You're right. After I reverted my profile the papyrus went back to normal, looking for things in the correct place.
 
Unfortunately, someone reported a save that took the whole night :explode: . But I'll keep testing! This bug is driving me crazy, I need to know what it is, even if I'm not the one who finds out xD



You're right. After I reverted my profile the papyrus went back to normal, looking for things in the correct place.
Honestly I’m glad you guys are putting in the effort this stuff is outside my pay grade, just looking at some of these files gives me migraines. I really need to take some basics of coding book reading or something it just doesn’t interest me in the slightest though.

either way keep up the good work I’m rooting for you guys to find this stuff.
 
Played for about a hour, tinkering around with Grey Garden and Sunshine Tidings settlements. Had a manual save take about 15 seconds, and then tried to exit the game the normal way and the auto save took about two minutes. I have a small load order and only Chapter 1 installed. This install has never seen chapter 2, just 2.0.0c - 2.0.0k of Chapter 1. Attached my Papyrus. Many stack dumps. My gut tells me the long save issue is script related. Maybe not by a single script by itself. Solving these stack dumps, might solve the issue.
 

Attachments

  • Papyrus.0.txt
    139 KB · Views: 5
Last edited:
Ok, I'm going to start looking over the 'stack dump' section more closely and see if I can break this stuff out a bit differently. I'll be working with the 'assumption' that 'stack frame count' (or it's index) will increase with each successive call from a given script and therefore be LIFO in structure. However, the dump itself will depend on how the VM dumps it. For that I'll assume a standard 'queue' where the newest entry is at the bottom of the 'list' (or think round-robin on the 'active' scripts), and it's being dumped or reported LIFO.

Another way I'm approaching this is 'active' means 'active' and 'suspended' means either waiting on another 'suspended', 'script' or 'active'- where the VM moves the return script frame wise from 'suspended' (which really means 'active', just not currently executing in the VM).

If anyone can think of something else let me know... for this rabbit hole I'm wearing a parachute!
 
Had a random thought this evening while researching this - if this is an issue with stack dumps (ie. too many scripts running), then Skyrim will have the problem as well... sure enough: https://github.com/Wildlander-mod/S...-quicksaves-not-recommended-for-use-in-ultsky

That simultaneously gives me peace of mind, as I didn't know what I would do if the issue were the esm changes in 2.0.0, but also means I've got some serious coding optimization sessions ahead.

@Mystical Panda It seems like you're getting closer to have a piece of software/methodology that can analyze this stuff in realtime and figure out which scripts are causing all this traffic. Having some starting points would definitely help. In addition, if you can get it streamlined enough to distribute to others we could widen the net to find out which scripts are causing all the backup.

Does any of you have any "long-saves" with a bunch of ActiveScripts? I've noticed that since we fixed the spotlight bug outside GNN, the saves folks are sending me for the long save issue tend to have pretty minimal numbers of ActiveScripts - which is what I would normally look at to trace back where the backups were occurring.

All that said, one of the things that still has me questioning the "too many scripts" theory is that I occasionally get the long saves while doing my Let's Play during the Q&A sessions. During that part of my stream, I set the gametime to 0.1 so that the weather doesn't change, which means the timers effectively never trigger to run all of the systems in SS2. When checking those saves in Fallrim tonight, they all had 0 ActiveScripts.
 
One thing I'm looking at with 'script dumps' is... taking them from the 1st reported to the last what calls were being made frame wise. As I understand it when a script makes an intrinsic (another script or engine) function call it allocates a new frame for that particular 'stack'. This frame allows transferring information to and from the scripts and game engine itself.

Since 'long saves' seem to be random (at least some of the times), I'm thinking it's more of a synergy or combination of scripts or calls happening rather than just too many verbatim. I get script lag on many game sessions, but no 'stack dumps', which some of this is related to driving a 1060 tI to 1440p instead of a targeted 1080p. If script time-slices (VM wise) are tied to the frame rate, less time would result in more clogging (so to speak) as larger scripts will take more cycles to complete as there is less time to process each script or stack frame. If this starts to add up by adding too many scripts (running new or calling functions) in too quickly, then it's going to overload the VM and dump.

Initially, looking at the 'dumped stack' frames, it seems the last thing they called might point to the 'overload', resource collision, or something. This is the tricky part- if the 'stack dump' we see in the papyrus log is FIFO then what was getting called (LIFO frame- waiting on another script or frame) could be what lead to the 'long save' situation, as colliding with another resource, actively running 'stack' or 'stack frame' or waiting for a shared resource such as an npc, player, object, workshop inventory, etc., to be made available.

During this (screenshot #6) 'stack dump' you see 200+ instances of '.onload' from two different scripts, which I'm guessing the 'onLoad' event happens each time a cell is loaded? Which would happen while moving about the map, fast traveling, etc., which could effect auto-saves, however when it happens while saving after sleeping and exiting the pip-boy I'm not sure. So that part doesn't add up unless the game engine does a hidden 'onLoad' again to use a "simplistic time changed, let's load everything again and pop in what's new- Beth is big about taking simple systems and over applying them" instead of just making a completely separate function.

Speaking of scripts, the overall spread is about 88% SS2 scripts (if I'm correct), not counting the GlobalVariable.GetValueInt(). So, just thinking out loud here, something causes one of the frames to wait, while other frames which depend on nested calls are stacked behind it.

I'll keep looking over the papyrus log and figure out how to use this to our advantage, if it's possible...
 

Attachments

  • screenshot-6.png
    screenshot-6.png
    1.9 MB · Views: 9
Sometimes a script will get paused for saving, and that script won't be able to recover properly from it's paused state.
I wonder what would happen if a script (or stack frame) is in the process of being paused and to pause it it must first reach the end of the function it called and before the called function ends, it calls yet another function. That type of situation could lead to a 'freezing' and over the suspended limit as each of the frames are waiting for the function they called to complete before they can be 'frozen', which would work it's way back up the stack chain but can't as there are too many recursives- and it dumps as they are allowed to stack more calls in outside of the initial 'freezing' of the VM from newer scheduled script executions, or event processing.
 
Trying to narrow down things further, I loaded the required mods / setup for the City Plan Contest together with the latest hotfix.
Placed 2 Asam boxes of plots in one go at Abernathy Farm plus like 15 settlers to start building. Had the same result as previous tests. Maximum save time was 5 to 10 seconds and that where only a few. Most of them <1.
I'm also not standing in one place, but continues running in the Settlement and pressing the save button at random intervals, go in to the pip-boy to trigger auto-save. While they all building and already started complaining about the food situation. So that part works totally fine.
System specs.
MSI MPG Trident 3 - i7-10700 - 16GB - NVIDIA GeForce GTX 1660 Super 6GB VRAM + 512GB M.2 + 1TB M.2
Monitor 34" Curved 21:9 Resolution 3440x1440 @ 144Hz Refresh Rate
 
Speaking of scripts, the overall spread is about 88% SS2 scripts (if I'm correct), not counting the GlobalVariable.GetValueInt(). So, just thinking out loud here, something causes one of the frames to wait, while other frames which depend on nested calls are stacked behind it.
This makes sense. If you check CPU usage and i/o during a long save, you'll see that CPU is minimal (less than 5% in my case), and i/o is zero. Since the beginning, I have suspected an active wait problem. Maybe the thread is waiting for some flag that never gets changed, or, as you theorized, some function call loop that eventually gets cleaned in the dump.

My problem with the dump theory is that I have dumps without long saves and long saves without dumps (and dumps with long saves also ofc). Last night when I got the 20k line dump, a few minutes (not seconds) after fast travel, I felt the engine hang for 2 or 3 seconds. As soon as I felt this, I quicksaved, and it was a fast save (it is in the attachment of my previous post).

I had a crazy idea this morning, but it is a lot of work for the devs. @kinggath, you can fork the git on the last patch for 1.1.0 and then you merge on that fork the release 2.0.0 and see the changes. Since everything points to the problem being in the scripts, this will show you everything that was changed from the last 'stable' version to the first conflicting one. I suspect this merge will be HUGE, as it will jump over all the internal testing versions you guys had, but the change that caused the problem will be there. It is possible to narrow it down a bit. From my experience, the long saves are caused by building, upgrading and/or scraping plots, especially those that affect virtual storage (industrial and conversion plots and storage containers). None of this is 100% replicable, but it's consistent on my current playthrough (again, the save is closing the three days mark, and it was a new game on the 2.0.0 release). If you are going to do this, I don't know how much the community can help you, as I suspect some internal knowledge of the mod is required to analyze the merge. Anyway, if you are willing to share it, you can count on me taking a look at it.

Trying to narrow down things further, I loaded the required mods / setup for the City Plan Contest together with the latest hotfix.
Placed 2 Asam boxes of plots in one go at Abernathy Farm plus like 15 settlers to start building. Had the same result as previous tests. Maximum save time was 5 to 10 seconds and that where only a few. Most of them <1.
This is so weird. I've tried something similar, and as soon as I placed the first industrial plot (the one from the quest), I got a long save. Other people reported long saves without doing any SS2 content, but I couldn't replicate this. Please keep updating this. Let us know if and when you get a long save :)
 
@nosfsos by rolling back and existing game to v2.0.0 and removing chapter 2 kinda invalidated your results. All objects instantiated by the newer version will have the newer version's scripts baked into the save and any running scirpt functions will try to complete using the save baked version of the script. Most of the errors in the log are caused by this. The more accurate test would be to create a new profile with just WSFW and SS2 v2.0.0, running the scripts loose and starting a new game. You can use the fast start mod in my sig to make it less painful.. ;)
This is weird, WSFW was trying to find a file in my g drive (which is my work virtual google drive), while the game is installed in the e drive (M2 SSD). Also, there are loads of references looking for things in the default steam instalation folder, but I changed it.
1642587890706.png
If you are talking about the path for DLC06:WorkshopPowerbikeScript is a vanilla Vault Tec DLC script, not WSFW. I can't remembber why, but script path displayed is the dev source path. IIRC, its baked into the compiled script.
 
Another way I'm approaching this is 'active' means 'active' and 'suspended' means either waiting on another 'suspended', 'script' or 'active'- where the VM moves the return script frame wise from 'suspended' (which really means 'active', just not currently executing in the VM).
From what I understand, any time a function is waiting for a return, it is suspended. For example, a simple Utility.Wait(x) will suspend a function for x seconds.
I get script lag on many game sessions, but no 'stack dumps', which some of this is related to driving a 1060 tI to 1440p instead of a targeted 1080p. If script time-slices (VM wise) are tied to the frame rate, less time would result in more clogging (so to speak) as larger scripts will take more cycles to complete as there is less time to process each script or stack frame.
As long as your video card can drive the target fps of 60, you should have no issues. IIRC, fps = cycles/sec the VM runs. The amount of time the VM gets per cycle is fixed to the value defined in the ini. (IIRC, fUpdateBudgetMS)
Initially, looking at the 'dumped stack' frames, it seems the last thing they called might point to the 'overload', resource collision, or something.
Unless the power of Todd compels otherwise, "Stack too deep (infinite recursion likely) - aborting call and returning None" should be logged.

"Stack too deep (infinite recursion likely) - aborting call and returning None"​


Too many functions have been called without first returning from them. This usually happens due to infinite recursion in a script, but could also appear in highly complicated scripts that have lots of deep function calls. Try simplifying the script, removing the recursion, and making sure that any recursion has a termination condition.
During this (screenshot #6) 'stack dump' you see 200+ instances of '.onload' from two different scripts, which I'm guessing the 'onLoad' event happens each time a cell is loaded?
OnLoad
Event called when the object is "as loaded as it can be" - meaning:
  • If the object is enabled, it will have completely loaded its 3D and be ready to animate or otherwise manipulated without issues.
  • If the object is disabled, it will receive the event when it enters the loaded area but obviously will not have its 3D.
  • The event is ALSO called when an object is enabled within the loaded area - in which case it will have its 3D when the event is fired.
 
@nosfsos by rolling back and existing game to v2.0.0 and removing chapter 2 kinda invalidated your results. All objects instantiated by the newer version will have the newer version's scripts baked into the save and any running scirpt functions will try to complete using the save baked version of the script. Most of the errors in the log are caused by this. The more accurate test would be to create a new profile with just WSFW and SS2 v2.0.0, running the scripts loose and starting a new game. You can use the fast start mod in my sig to make it less painful.. ;)
Thank you so much for clarifying. I usually go with Start Me Up for the extra perk options and only skip the pre-war part. I do the vault escape and SS2 usually gets started still inside the vault. I'll set up a new profile on MO2 and try a new game.

@nosfsos by rolling back and existing game to v2.0.0 and removing chapter 2 kinda invalidated your results. All objects instantiated by the newer version will have the newer version's scripts baked into the save and any running scirpt functions will try to complete using the save baked version of the script. Most of the errors in the log are caused by this. The more accurate test would be to create a new profile with just WSFW and SS2 v2.0.0, running the scripts loose and starting a new game. You can use the fast start mod in my sig to make it less painful.. ;)

View attachment 15123
If you are talking about the path for DLC06:WorkshopPowerbikeScript is a vanilla Vault Tec DLC script, not WSFW. I can't remembber why, but script path displayed is the dev source path. IIRC, its baked into the compiled script.
So papyrus always shows the dev path instead of the actual path? Not much useful Todd!

1642590952626.png

Here is a small example with an SS2 script. It triggered when I tried to talk to a Nightingale medic roaming near Jamaica Plan. Once again, that path does not exist in my PC. Once I noticed that, I tried a quick save and got a short long one (just 1min, which is still longer than it should take).
 
I wonder if there is significance in the fact the VM was frozen and it took ~1 min from the save call to the thaw VM call? :scratchhead
 
My problem with the dump theory is that I have dumps without long saves and long saves without dumps (and dumps with long saves also ofc). Last night when I got the 20k line dump, a few minutes (not seconds) after fast travel, I felt the engine hang for 2 or 3 seconds. As soon as I felt this, I quicksaved, and it was a fast save (it is in the attachment of my previous post).
I totally agree with this. I've had 'stack dumps' with Skyrim and never had 'long saves', and with a session test here resulted in a 'stack dump' well before a 'long save'.
 
I wonder if there is significance in the fact the VM was frozen and it took ~1 min from the save call to the thaw VM call? :scratchhead
That's a constant in all long saves I've seen reported in this thread. It freezes, hangs on saving and thaw after a few minutes. For me is anywhere between 1min to 20min for long saves and less than 2sec for fast saves.

So, I know how to code, c# and unity3D being my speciality. Using that as a knowledge base, this is my interpretation of what's going on. Please, anyone, feel free to correct me:

The Bethesda engine seems to freeze the main thread (which I believe is what they call the VM where the game is running), take a snapshot while is frozen, where it is preparing the data for the save file, then write to the file and release the VM. I can be completely wrong, since I never dabbled in the CK or CE, but it is what it feels to me. If it was me, I would put the save process outside the MT so it wouldn't freeze the game and just output to the user "saving..." and "saved". But that's not what they did and what's been reported for much more knowledgeable people than me here, is that the save process is hidden from both papyrus and the CK. That's one of the reasons why it is so hard to debug it, it is essentially a black box.

The current information gathered is as follows:
  • it seems to be related to scripts instead of ESMs (whatever that means... once again, I know nothing of this engine)
  • it is caused by changes made from SS2 1.1.0 to SS2 2.0.0
  • Chap 2 mod doesn't seem to affect the saving time
  • changes inside the settlement can trigger long saves
    • building plots
    • upgrading plots
    • scraping plots
    • building storage units
    • upgrade city plans (which is just all of the above)
  • There are huge stack dumps from script lag being reported by papyrus log, and most of the entries in the dumps are from SS2
  • Reverting the scripts to 1.1.0 while keeping the ESMs from 2.0.0 AND starting a new game seems to solve the save file (I couldn't replicate this, cause of what you explained of the scripts being baked on the save file itself, but others reported huge improvements)
  • While comparing 2 saves using ReSaver, a long save is virtually identical to a fast save
And I think that's it. If anyone has more info reply and I'll edit this post. Might be useful to have a summary of what's happening :)
 
From what I understand, any time a function is waiting for a return, it is suspended. For example, a simple Utility.Wait(x) will suspend a function for x seconds.

As long as your video card can drive the target fps of 60, you should have no issues. IIRC, fps = cycles/sec the VM runs. The amount of time the VM gets per cycle is fixed to the value defined in the ini. (IIRC, fUpdateBudgetMS)

Unless the power of Todd compels otherwise, "Stack too deep (infinite recursion likely) - aborting call and returning None" should be logged.



Thanks for the good information and helping me slowly get an overall understanding of the VM. For me, this is going to take awhile.
 
So papyrus always shows the dev path instead of the actual path? Not much useful Todd!
Ah yeah, it shows the path of whomever's PC last compiled it. Perhaps a way to make it quicker to know who last edited a script on their team without having to go check source control.
 
That's a constant in all long saves I've seen reported in this thread. It freezes, hangs on saving and thaw after a few minutes.
I wonder if the the long saves where there aren't stack dumps just have a significant amount of references tangled up in the scripts that are running.

For example, does it take longer to freeze 100 running scripts or 1 script involving 100 different references? We don't really know, but that seems reasonable that both scenarios could be problematic - and there are plenty of times where SS2 is iterating over tons of items.
 
I had a crazy idea this morning, but it is a lot of work for the devs. @kinggath, you can fork the git on the last patch for 1.1.0 and then you merge on that fork the release 2.0.0 and see the changes.
Ha - I actually did do this last night, but its a wreck to look at that stuff on Git, so I was looking for some text comparison software to handle it. I generally use WinMerge, which usually works great - but for some reason the files downloaded from the 1.1.0 commit have something off about them and WinMerge is flagging every line in the file as changed. So about to test Notepad++ compare plugin...
 
Top