
[Sugarcube] Odd case of performance untestability

Hi all,

Have a slightly odd case here that's got me stumped. I've written a fairly extensive SugarCube-based story - essentially a display framework written on top of the Twine/SC base - which I'm currently polishing up for release. It all runs smoothly except for one DOM replace passage, which for some reason hiccups on execution.

I wanted to figure out why and implemented a few performance.now() cases to test execution speed; not sure which part of the object handling is causing the slowdown but I'm pretty sure it's related to deeply nested objects & iterations somewhere in the code.

Only, when I execute the code with the timer elements in place, it executes smoothly (tested on FF and Chrome). Without them it lags visibly as the element is replaced.
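For reference, the timing harness is just a thin wrapper along these lines (a minimal sketch; the label and callback are illustrative, not the actual story code):

```javascript
// A thin wrapper around the hi-res timer; logs elapsed time for a callback.
function timeIt(label, fn) {
  var start = performance.now();
  fn();
  var elapsed = performance.now() - start;
  console.log(label + ": " + elapsed.toFixed(2) + "ms");
}

// e.g. timeIt("HUD replace", function () { /* the DOM replace goes here */ });
```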

Can't make heads or tails of this. Anyone have any ideas?

Comments

  • It wouldn't be the first time I've heard of benchmarking code altering the performance of some JavaScript (especially when dealing with the fickle beast that is the DOM). Though, performance.now() calls doing it seems odd. I assume we're both talking about the Performance hi-res timer API (window.performance), yes?

    What, exactly, is this DOM replace doing?
  • Yep. That's the function I'm talking about.

    To answer your question: the replace sets an object in a temp variable, checks a bunch of values, and if one is set, iterates over an array of child objects (never more than three in the given state). Results are printed in a table which is 2-6 entries long. I'd experienced predictable slowdowns with longer iterations before and figured that was the cause in this case too, which would have led me to redesign the code (output outside the loop rather than inside it).
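The redesign I have in mind would look roughly like this (a minimal sketch; the field names and element id are made up):

```javascript
// Build the table markup inside the loop, then touch the DOM exactly once.
function renderRows(items) {
  var html = "";
  for (var i = 0; i < items.length; i++) {
    html += "<tr><td>" + items[i].name + "</td><td>" + items[i].value + "</td></tr>";
  }
  return html;
}

// One write instead of one per iteration (element id is hypothetical):
// document.getElementById("resultTable").innerHTML = renderRows(children);
```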

    Now it's gotten... weird. I can't replicate the problem - several other elements have slowed down since I made minor changes but the one in question executes smoothly; in the past this was the only one behaving badly. This circumstance is leading me to think it's just the DOM replace in general being unstable, or something to do with read/writes on a large file, and likely beyond my control.

    I'm not as familiar with JS as I'd like to be so I really have no idea what's going on under the hood (the added layers don't make it easier in this case). Frankly, it's a bit of a non-issue anyways, as none of my testers have complained about it. I was just fixing up poor code and was stumped why only one specific case was acting up - I knew the DOM was inefficient, I just didn't realize how wildly unstable it could be.
  • Interacting* with the DOM is the slowest thing you can do in JavaScript, largely because of its design. Though there are usually ways to mitigate issues. Without knowing more, my guess is that you aren't experiencing instability, but are triggering suboptimal behavior by doing suboptimal things (i.e. you're poking the DOM with the wrong end of the stick).

    Is this code super top secret or can you share?

    * And I meant interacting. For example, simply asking for data from the DOM can trigger a reflow if updates have been buffered (to ensure that the returned data is correct, naturally).
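To illustrate the point about reads (a hedged sketch; `els` is an arbitrary array of elements): interleaving layout reads with style writes forces a synchronous reflow on every pass, while batching all reads before all writes lets the browser lay out once.

```javascript
// Anti-pattern: read (offsetHeight) then write (style.height) per element,
// forcing a synchronous layout on each iteration.
function thrash(els) {
  for (var i = 0; i < els.length; i++) {
    els[i].style.height = (els[i].offsetHeight + 10) + "px";
  }
}

// Better: do all the reads first, then all the writes.
function batched(els) {
  var heights = els.map(function (el) { return el.offsetHeight; });
  els.forEach(function (el, i) { el.style.height = (heights[i] + 10) + "px"; });
}
```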
  • Yeah, I've realized the issue is that I don't actually know what I'm doing. Will read up on DOM manipulation and try to come up with a solution that is less wasteful.

    I'd post but the code causing the problem is... complex (spread out over several .tws files, interlinked between at least a dozen passages, and really only makes sense in context of an even larger system of around 50+ segments). Rather than trying to explain the insane logic I used to come up with the framework, it's more time-efficient for everyone if I just figure out where the glaringly stupid design decision is (there are probably several). I already have a fairly good idea.

    Bit of a roundabout discussion but thanks for steering me in the right direction.
  • Hmm, so apparently you can't edit posts in the new layout? Anyways, if you're interested, the entry point which causes the issue is this:
    <<click ''>>
    <<set $HUD.currentScreen = "EquipEntry">>
    <<set $HUD.backScreen = "HUD_Character">>
    <<replace '#HUDContent'>>
    <<set $itemTODisplay = $player.equip.chest>>
    <<display 'HUD_ItemDetails'>>
    <</replace>>
    <<replace '#HudNav'>>
    <<display 'HUDNav'>>
    <</replace>>
    <</click>>
    

    I've done several tests and it doesn't seem it's the DOM replace in itself being needlessly inefficient, so the above segment won't help at all in figuring this out. A single line replace (e.g. grab element, replace with one line of text) is equally likely to be slow as a full block replace.

    So my guess is it's related to the processing done elsewhere, the CSS table layout being applied, or the rather large size of the HTML it's working on (~4 MB, no elements embedded). I have a feeling it's due to nested display: tables, which are frankly unnecessary given the final product. I'll keep experimenting, but that seems to be the expensive operation here, not the actual element being grabbed or any of the JavaScript/jQuery involved.
  • You can edit posts, but it has to be within the first hour.
    Wish I could help with the other part, but you guys are WAY above my head.
  • Ah. Didn't know that. Thanks for sharing :)

    On the original issue: another round of tests and - nothing. I timed all the SugarCube functions + various bits of my calls in as many ways as I could think of being sensible and they all reported execution in 10-30ms across all browsers. Could visually confirm the items were actually being replaced in that timeframe too by breaking after exec of <<replace>>.

    This should not (theoretically) be visible as more than a short pause. That is in fact the case on FF and IE, provided my hard drive isn't under load (caching or AV or IO related, maybe?). Strangely, there is always a visible lag on Chrome. Whatever is going on here, it's definitely something on my end. Not that I'm any less disconcerted by that, but okay, I'll accept that explanation.

    I did however notice something else when testing on IE. With SugarCube set to single-state history, i.e. the option:
    config.disableHistoryTracking = true;
    

    any savefiles created in FF or Chrome can't be transferred to IE. It pops the following message:

    "State object is from an incompatible history mode. Aborting load."

    and after closer examination, it seems the entire save system is broken when single state is set, which might be related to some client setting on my side. Just wondering if you'd ever come across this issue before? For reference: I can pass the savefiles around between all the other browsers I tried (desktop + mobile chrome + ff).
  • edited May 2015
    It's unlikely to be broken considering the message, which is informative. I assume you're opening the file locally. When opening files locally, IE uses a different, and incompatible, history mode, thus the "incompatible history mode" message. You should find that the setting for config.disableHistoryTracking is immaterial to the error. The culprit is opening the file locally in IE. Serving the file via HTTP (from a normal web server or something like Mongoose) should make the issue disappear.

    Back to the other issue. Is this happening for you with your browser's developer tools closed? I ask because the tools add a significant amount of overhead to the running page, so if you're only seeing the issue with the tools open….
  • edited May 2015
    About the IE issue - thanks for clarifying. I figured it was something along those lines. Just wanted to make sure I wasn't doing something stupid.

    To the original point: it isn't the tools; the same issue occurs with or without debugging interfaces & tools active, and again it's only noticeable on Chrome. Similar stuff does happen in FF and IE, which is expected given the complexity, but only when other factors come into play like system load, tab volume, debugging, or the fact that the cache is clogged beyond measure. Running on a 'fresh' browser, only Chrome is impacted.

    Also, on Chrome the performance hit seems (oddly) to relate to the build order of the source. If I add or remove DOM elements or JS/jQuery code, the slowdown may occur in a slightly different place - the location is consistent for that element until I rebuild. It is always caused by a replace action but not necessarily related to the element/code I added or removed, which is starting to make me think it's a Chrome render issue and not (or not entirely) related to the code in question.

    Going purely by what I see, it looks to me like the Chrome renderer is hanging up for a second as it recalculates the page. It doesn't leave elements dangling or anything, nor does it display half-loaded HTML or clear the page or any such oddity, and JS execution isn't impacted in any way. The whole window just locks up until it displays the new layout.

    Thinking aloud now since I'm utterly at a loss, I wonder if there is some sort of inefficiency in Chrome's reflow mechanism, or possibly in how it builds the render tree. Coupled with the way I am building this specific UI:

    - base passage displays layout
    - various subpassages are 'displayed'
    - click macro calls 'replace' + 'display' on specific element

    then it could be that I'm hitting a case where Chrome is being forced to reflow the entire document over a small change triggered somewhere after the initial reflow (either by one of my passages or by some CSS condition), which could lead to it hanging up as it rebuilds an already half-built tree. It's entirely possible that the same change doesn't force a reflow in FF & IE and just gets added to the queue, while Chrome handles that particular case - perhaps in combination with other things going on - differently. If this is the case, then finding the cause in such a large source will prove a minor nightmare. But I'll keep looking.
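If that theory holds, one mitigation worth trying (a sketch under that assumption, not what SugarCube actually does internally) would be to assemble the replacement subtree off-DOM and swap it in with a single mutation, so the live tree only changes once:

```javascript
// Build replacement nodes in a detached DocumentFragment, then perform
// one clear + one insert so the browser only has to lay out once.
function replaceContents(container, newNodes) {
  var frag = document.createDocumentFragment();
  for (var i = 0; i < newNodes.length; i++) {
    frag.appendChild(newNodes[i]);
  }
  container.textContent = "";  // clear in a single operation
  container.appendChild(frag); // single insertion into the live tree
}
```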

    Edit: I've been timing various events with the chrome tools to see if I can get any insight but according to those it's running smoothly (even though it's visibly not). The 'high' load is still on scripting with its 50ms latency (calling all the objects from the wiki store, I presume) but it still hangs very visibly on displaying elements, which isn't reflected anywhere (event timer says render is taking less than 2ms whereas my eyes say it's getting stuck long enough to be visible).
  • edited May 2015
    I get a lot of visual lag on Chrome if I try to replace background images at the same time as it tries to load a new passage.

    I fixed it by preloading the backgrounds with JavaScript (which leads me to think Chrome was trying to reload the backgrounds each and every single time instead of storing them in cache for some reason), but to be safe I still set timings so that my complex tasks, like swapping backgrounds, happen about 10ms after a passage loads so Chrome knows how to order things.
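The preloading itself is nothing fancy, roughly like this (a minimal sketch; the URLs are placeholders):

```javascript
// Create Image objects up front so the files are fetched and cached before
// any passage swaps them in; keep references so they aren't garbage collected.
var preloadedImages = [];
function preloadImages(urls) {
  for (var i = 0; i < urls.length; i++) {
    var img = new Image();
    img.src = urls[i];
    preloadedImages.push(img);
  }
}

// preloadImages(["images/bg-main.jpg", "images/border.png"]);
```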

    I used to like Chrome, too.
  • Hmm, I'm not explicitly replacing any backgrounds but the events affected do involve backgrounds & images in general (including a body background and image border which, theoretically, should never be reloaded as they're static elements wrapped around the passages).

    If Chrome has trouble with any of these operations, that might explain why I'm getting more slowdown on these passages vs. those with primarily text as content. I could imagine it's loading the images and reloading the rather sizable background + image border, which would predictably result in the lag I'm experiencing. Though if it really is that, then I seriously WTF at what the devs were thinking.

    Anyways, I'll try manually pre-caching once I'm done writing up technical specs and see whether that has any effect on performance.
  • edited May 2015
    Oh. My. God. Someone shoot me. I figured it out.

    I was running Math.round() in several print calls, which I'd even realized was an issue and had started fixing (which is why it worked on some passages). Of course, halfway through I got sidetracked and forgot about it. There's even a comment in one of the affected passages reminding me to finish doing that but... ugh.

    /facepalm

    Yeah. This actually makes sense now.

    The original issue really was the for-loop + replace, as I suspected at one point. The further issues were related to the Math.round() calls, which led me to believe there was in fact something else going on.
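For anyone hitting something similar: the fix amounts to rounding once when the values change, rather than calling Math.round() inside every print during the replace (a sketch; the stat names are made up):

```javascript
// Compute display-ready values once, ahead of the DOM replace, instead of
// calling Math.round() repeatedly inside the print/replace loop.
function snapshotForDisplay(stats) {
  return {
    hp: Math.round(stats.hp),
    mp: Math.round(stats.mp),
    weight: Math.round(stats.weight * 10) / 10 // one decimal place
  };
}
```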