JeremyNicoll

Everything posted by JeremyNicoll

  1. > I'm not really sure we're getting anywhere. The issue happened, then was resolved, and hasn't > happened again. I agree we're not getting anywhere. But I don't agree that the issue was resolved. We have no idea why /automatic/ updates failed to happen for three days, then a manual update worked. > As for debug logs, these are not a design defect. ... They're fine so far as they go, but the chicken-and-egg problem of needing logging on to find out if there's a problem, but not having logs on until after you've seen a problem, has no solution if you're not willing to find a way around it. I still think that there's doubt over whether your scheduler always starts update attempts when it is meant to. If you won't add a log record to say so, another option would be to create an empty file in: %TEMP% with a name like: EIS update started at yyyymmdd hhmmss.log and delete it again when an update process runs to completion. I'm not suggesting it should go in a2temp, because /if/ a user should end up with lots of those flag files, they would be much less likely to find them if they were in \a2temp. Whenever anyone complains about an updating problem you could ask if any such files exist.
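The flag-file idea in the post above could be sketched roughly like this. It is only an illustration in Python, not a description of how EIS works: the helper names `flag_name`, `update_flag` and `leftover_flags` are hypothetical, and the only details taken from the post are the file-name pattern and the use of the temp folder.

```python
import tempfile
import time
from contextlib import contextmanager
from pathlib import Path


def flag_name(now=None):
    """Build a name like 'EIS update started at 20160421 055500.log'."""
    stamp = time.strftime("%Y%m%d %H%M%S", now or time.localtime())
    return f"EIS update started at {stamp}.log"


@contextmanager
def update_flag(directory=None):
    """Create an empty flag file; remove it only if the update completes."""
    folder = Path(directory or tempfile.gettempdir())
    flag = folder / flag_name()
    flag.touch()
    yield flag
    # Reached only when the body finished without raising, so a failed
    # (or killed) update leaves its flag file behind as evidence.
    flag.unlink()


def leftover_flags(directory=None):
    """List flag files left behind by updates that never completed."""
    folder = Path(directory or tempfile.gettempdir())
    return sorted(folder.glob("EIS update started at *.log"))
```

A support request could then start with the equivalent of `leftover_flags()`: any files it returns show that an update attempt began and never ran to completion.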
  2. Arthur, see PM for a full set of debug logs for 24th-29th, and a summary of climbing memory use over that period.
  3. > Getting a hold of an old build isn't difficult, however keeping it is. The updater will automatically > download and install the latest version unless you disable updates. And that would also prevent signature updates? I wouldn't want to do that for very long.
  4. > If you could start an update while a previous update appeared to be frozen/stalled, then > the previous update wasn't actually stalled and it may have just been a UI glitch. A three-day failure to update its signatures, AND no decent alert to the user saying that there was a problem is FAR MORE than a 'UI glitch'. It's a serious fault. I note that in another thread someone else recently complained that the tiny systray shield's colour-change is just not adequate, especially for older eyes on high-res screens - as a way of notifying the user of a serious issue. [http://support.emsisoft.com/topic/20090-eam-shows-no-protection/] > It would also double the number of rows in the update log table of the SQLite database, which > would cut the number of updates that get logged in half. This is the sort of thing that we > generally keep in debug logs rather than in regular logs. ... which would be fine if anyone ever had debug logging on when this occurs. Debug logs are no good for things that users can't reproduce. You must know that. > [killing stalled updates] That wouldn't solve the problem, and the user would be in the same > bad situation. No updates, and no clue why. Not correct. I see you snipped my suggestion that when such a 'kill' is done, it would be logged. So while, yes, it wouldn't fix the problem - but then no-one seems to understand how to fix the problem - it would not leave the user without any clue. There would be a log entry saying that the stalled update had been detected. At least we'd then all know that these presumed stalled updates were really happening. > Debug logs are the way that developers find out where a problem in the software is. Without them, > they have no way to know why something isn't working as expected. 
They can't just go through > millions of lines of code and hope they stumble upon the part of the code that is causing the > problem, and just so happen to notice an issue with that code that they overlooked when they > wrote it/updated it. The only time bugs get fixed without debug logs is when our QA team can > reproduce the issue in their own testing. Then you have a fundamental design problem, don't you? Debug logs are not "the way", they're just one way of finding out why things don't work. There needs to be a middle ground, where when users never have debug logging on when a problem occurs, developers have to find another way to collect basic info - maybe not enough to pin down exactly why something is going wrong, but enough to /prove/ that it is. I led a programming team before illness forced me to stop work. We occasionally had this sort of problem, and fixed it with a basic trace that ran all the time in whichever specific part of our product until the reported problem happened again. We were lucky (compared with you) that our customers were on a corporate network and so we were able to have our product send those small log files to us automatically and we received and filtered them automatically until one showed what we needed to know. Obviously you can't, in these days of angst about apps that 'phone home', have EIS send logs to you without the customers knowing, but it should be possible to create (if you like) an extra log that records some small details about the start and stop of the update processes. If you're so concerned about its impact on the SQLite databases put it somewhere else. It has to be active all the time so that it's active when people hit the problem. But for goodness' sake, do something about this problem. > Obviously, if our management thinks that adding some of your suggestions is a good idea, then > they will have our developers work on it. Do your management know about any of my suggestions?
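As a rough illustration of the "extra log that is always on" suggested above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function names `make_update_trace` and `run_update`, the 64 KB size cap and the message wording are invented, and nothing here reflects EIS's real code. The point is only that a small, size-capped file kept outside the main database can record the start and end of every attempt.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_update_trace(path, max_bytes=64 * 1024, backups=1):
    """Return a logger writing to a small, size-capped file (hypothetical helper)."""
    logger = logging.getLogger("update_trace:" + str(path))
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep trace lines out of any other log
    if not logger.handlers:
        handler = RotatingFileHandler(
            path, maxBytes=max_bytes, backupCount=backups, encoding="utf-8"
        )
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def run_update(trace, do_update, trigger):
    """Record the start and the outcome of one update attempt."""
    trace.info("update attempt started (%s)", trigger)
    try:
        do_update()
    except Exception as exc:
        trace.info("update attempt failed: %r", exc)
        raise
    trace.info("update attempt finished")
```

Because the file rotates at a fixed size, the trace can stay enabled permanently without growing, which is what makes it useful for problems nobody can reproduce on demand.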
  5. > We already have error handling ... Adding more folders just adds complexity ... Hmmm. Even YOU agreed in your 20160318 1128 post that this area needs more work. Do the developers agree that the present situation is not good enough?
  6. If there's a way to, and if it would help your developers, I'd be willing to install an older build to see if the problem goes away.
  7. Periodically in the last couple of weeks, I've cross-checked Process Hacker's figures with those of Task Manager (in W8.1), and here, they've always agreed with each other. It's interesting that Yilee sees the problem on W7 Ultimate (on a desktop?) and on a laptop (precise OS unstated above, maybe W7?), and I on W8.1, both of us with 64 bit systems. Also, my machine has "activate memory usage optimization" turned on, and has done ever since I installed EIS on 28th Feb 2016 (I looked back to my install notes where I'd noted the settings I chose to start with). So if that setting is relevant, it was harmless until the reported start of this problem. I did a full shutdown & reboot again over the weekend, but this time turned debug logging on just before I shut the system. Possibly later today, depending on how high the memory use has climbed to (it's doing it steadily but I've not been awake so much in the last few days), I'll shutdown & reboot and then make the debug logs for the whole time: boot through to shutdown available to you.
  8. OK, just summarising where I think this discussion has got to. I think I'm waiting for the developers to answer two questions you've passed to them, namely: a) (as described in your post, which I see datestamped as 20160318 1128): - developer comments on using separate folders for each set of update files acquired to reduce chance that a problem with the current single specific 'a2temp' folder, maybe from a previously stalled update might somehow interfere with a newer update attempt b) (as described in your post, which I see datestamped as 20160421 0555): - developer comments on why cancel didn't work
  9. OK... your first & second answers make it clear that you think there was a stalled update preventing any other automatic updates from starting. But you then say that a manual (check for) updates would be impossible - it would just show the current one's progress. But right at the start of this discussion I told you that I triggered a manual update, which worked immediately. To me, that suggests that there wasn't a stalled update, but instead a scheduling problem. This is why I keep suggesting that a log entry saying that a scheduled update is about to start, would be a good thing. If stalled updates really do block other attempted updates from starting maybe your suite needs deliberately to look for update processes that started - say - more than 3 hours ago, and kill them. And, if it did do that, LOG that it found such a problem and tried to fix it. Unless the developers start trying to find where the problems lie, I don't think this will ever get fixed. I've been an EMSI customer for quite a while, and I've read loads of threads which boil down to updates not happening when people expected them to. Constantly being told that 'debug logs' are needed to solve the problem just isn't good enough. We all know that nearly no-one has debug logs active when problems occur. I'm trying to suggest things that might help pin down where the problems are.
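The "look for update processes that started more than 3 hours ago, kill them, and LOG it" suggestion above might look something like the following Python sketch. The `UpdateJob` record, the `reap_stalled` function and the `kill`/`log` callbacks are all hypothetical names used for illustration; only the three-hour limit comes from the post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class UpdateJob:
    job_id: int          # hypothetical identifier for a running update
    started: datetime    # when the attempt began


def reap_stalled(jobs, now, kill, log, limit=timedelta(hours=3)):
    """Terminate jobs older than `limit`, logging each one; return the rest."""
    survivors = []
    for job in jobs:
        age = now - job.started
        if age > limit:
            # Log first, so there is a record even if the kill itself fails.
            log(f"stalled update {job.job_id} (running for {age}) detected; terminating")
            kill(job)
        else:
            survivors.append(job)
    return survivors
```

Even if killing the job does not cure the underlying fault, the log line is the useful part: it would show whether stalled updates really occur.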
  10. > You have a lot of non-standard settings I'm not sure that I do, as not everything we have talked about has yet been translated into actual settings changes. I tend not to change stuff until/if I know I can devote some time in the following days to watching the effects of those changes and/or finding ways to test their impact. Since the problem seems to happen after reboots (from full 'cold' shutdowns), and as I haven't touched any setting (apart from turning off debug logging, and there have been several reboots since then), I've copied all the .ini and ini.backup files out of my EIS install folder, used 7zip to pack those up, and PMed that 7z file to you.
  11. Yilee: > I also could use Sysinternals process Explorer if you believe it would be much better for > this situation. Task Manager should be fine > My physical memory usage is at 53 to 54%. A2servise uses the most memory out of the running > processes at an average of 5,800,200 K for working set, peak working set and private working > set with commit at around 6,375,656 K. I have been booted for about 2 whole days. Without knowing what's normal on your machine it's hard to be sure, but with no VMs and (as you said lower down) 16 GB of RAM, on a system that's only been up two days, having commit charge already at 6 GB seems wildly high to me. Also are those working set figures for the whole OS? I just rebooted (because I've shifted from one house to another and chose to turn the laptop off completely while doing that) and although - again - figures are climbing - working set for a2service here is currently only about 3 MB, not 5.8 GB. I just happened to be watching as a set of updates arrived and working set went up to about 180 MB while a2service dealt with that, but fell a few minutes later to about 3 MB again. > On the other hand, it makes sense that an EAM (anti-malware) type program would be using the > most memory. It's sensible that it should use RAM if it needs it, but remember that most of Emsi's customers won't have systems with 16 GB RAM in them, so it ought to be able to run in much less. And in that blog post of theirs, they said they can cram all those signatures into about 200 MB of RAM, so that alone shouldn't be wasting GB of virtual memory. > I mention VMware because I believe you mentioned that you were running a VM. I'm not. I used 'vm' and 'vs' in some posts, as abbreviations for 'virtual memory' and 'virtual storage'.
  12. Stapp, a 'restart' does a full shutdown followed by a reboot. If you don't want the reboot to happen, ie you really do want your machine shutdown, then it's not a good choice. /IF/ you have your machine configured to ask for a power-on password then you could turn it off at that point, but if you don't then you've still got the same problem of how to turn the rebooted machine off completely. A shutdown is NOT a pretend one if you invoke it with the right options from the command-line's shutdown.exe command, OR if on the GUI you shift-click the menu entry rather than plain click it.
  13. > in the case of an update that is frozen you're still going to see evidence of that > in the update logs since there will be gaps in the update log entries Yes, if that's the cause. But how do you know that your internal scheduler properly scheduled successive attempts? Does your automatic scheduler only schedule later attempts if the current one completes ok? I mean in my situation, was my system likely to have one stalled upgrade that prevented the next 30+ being attempted, or did 30+ attempts start and get nowhere? (This is why I think it would be useful to see an attempt starting.) And, if automatic attempts are stalled, why would a manual attempt succeed? Does a manual attempt clear any flags or anything before it starts that don't get cleared when an automatic attempt starts?
  14. > W8/10 shutdowns... As far as I know, even clicking 'shutdown' in the Power Options menu doesn't do a full shutdown; I think you have to shift-click it.
  15. I should have added... I'd been noting memory use figures every few hours since I first noticed a problem. As I've said above, I use 'Process Hacker' to monitor all sorts of things in my system, and I have it place a small icon on the taskbar showing physical memory use. It was the increasing (well past normal) display there that first led me to look closely at both virtual & real memory use, and then to see a2service's growing virtual memory problem. The displays in Process Hacker of memory use (and many other values) are similar to those one can get in Task Manager (if you right-click its column titles and turn on lots of columns that are normally hidden). I'm not sure if Task Manager can also put tiny graphs onto the systray so you can see a sort of summary of eg cpu use, or whatever. Task Manager will however show you a summary of memory use. The only advice I can really give is that you look to see what values are being shown, and see if they are growing. If they are, you'll have to make your own mind up about how nervous you get as they reach maybe 80%, 90%, 95%... of their maximum possible values. It's hard for you, if you don't know what your system's normal values are.
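For anyone noting figures by hand every few hours, as described above, a very crude way to judge "how long until it reaches 90%" is a straight-line extrapolation from the first and last readings. The Python sketch below does just that; the function name `hours_until` and the sample numbers are invented for the example, and real memory growth need not be linear.

```python
def hours_until(samples, threshold):
    """Estimate hours until `threshold` is reached.

    samples: list of (hours_since_boot, value) pairs in time order, with
    values in the same units as threshold (for example percent in use).
    Returns None if the value is not growing.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples taken at different times")
    rate = (v1 - v0) / (t1 - t0)  # average growth per hour
    if rate <= 0:
        return None
    return max(0.0, (threshold - v1) / rate)


# Made-up readings: 40% in use at boot, 60% ten hours later.
readings = [(0, 40.0), (10, 60.0)]
print(hours_until(readings, 90.0))
```

With these invented readings the growth is 2 percentage points an hour, so the estimate printed is 15 hours until 90%.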
  16. Yilee, thanks for your thanks... When I first reported this I was using EIS v11.6.1.6315, and - actually - I'd not noticed that a newer version had come along. Maybe that's what took me by surprise, as described in post #8 above. I'm not sure that I'd want to abandon signature updates... I don't know if it's possible to keep getting signature updates but not get executable code updates at the same time. Do signature updates sometimes need revised executables to work? To be honest I'm surprised that no-one else has reported a problem. It makes me (and maybe Emsisoft) wonder why it is that my system has the problem. Then again, maybe very very few users monitor resource use on their machines, and even if they do, might not see EIS as part of the cause, if it is. There's a possibility that EIS is a victim of a problem in the system, not a cause. Who can tell? However, for me, when memory use climbs high enough that either Task Manager or Process Hacker (or the broadly similar Process Explorer tool) show that the OS is gobbling up real RAM and at the same time EIS is apparently using a vast amount of virtual memory (and that's running out too) it's reasonable to expect a problem sooner or later. I don't know enough about W8.1 to know how close to total memory exhaustion one can let a system get to safely. If one lets memory use get too high, then there may not be enough to allow the system to do a controlled shutdown (I mean if I tell it to close), and if it gets even worse, you would eventually expect a BSOD... and possible file corruption (as with any other BSOD). It may be that last night when I eventually rebooted that I could have left the machine for another few hours. Maybe another whole day. But if the system was going to fall over in a day or so, there was no point in pushing my luck. I'm sorry I can't provide a more concrete answer.
  17. What OS are they all running? Are they using any screen automation / macro software (the sort of thing that sends pretend mouse-clicks etc to the screen to automate a manual process) - software like AutoHotKey, or autoitscript, or mjtnet's macro scheduler etc? Incautiously written scripts might trigger something like this. But then again, a reboot would be unlikely to fix it.
  18. There was a long discussion about this some time ago. See: http://support.emsisoft.com/topic/18603-lots-of-files-in-temp-dir-hold-open-by-a2guard/
  19. Oh... did your developers have any ideas? This did all start after the last set of software updates for EIS. Is there any diagnostic utility they'd like me to run, or - as I don't know my way around such things - anything they'd like me to install so that they could use them in a remote access session?
  20. > It's not abnormal for there to be more memory reserved for an application than it would > normally use Indeed, that's true. But that's the difference between 'reserved' and 'committed' pages, and the task manager 'Memory' (as opposed to 'details') screenshot shows the commit charge was very high. > I think it has to do with how Windows managed virtual memory, and wanting to make sure > that if memory usage suddenly spikes for a process there is memory reserved for it to > ensure that it doesn't try to use more memory than is available. I've not read anything (and I've read a lot on VM management in the last two days; see below for URLs for a series of illuminating articles, if you've time to look at them) that suggests that the OS would reserve vm pages for a process off its own bat. The process has to ask for the pages. I hadn't originally understood every aspect of what I saw in the 'Memory' screenshot; it showed the non-paged pool using 4.7 GB. This is vm that's NEVER paged out, ie always located in physical RAM, and explains why (as well as seeing EIS's Private Bytes going up & up) I was seeing real memory use going up & up. The NP pool contains the kernel, OS data structures that must be in RAM so that eg interrupts can be handled, those for mutexes/semaphores, paging control tables, etc, and storage acquired by drivers (or I suppose any other kernel-state program that asks for NP storage). By late last night the size of the NP pool had grown so big (along with EIS's Private Bytes which was by then 2.35 GB) that I rebooted. I was scared that the system would crash in a catastrophic fashion if left running. I took screenshots (attached) of the memory summary (from Process Hacker) immediately before and after the reboot (a full 'cold' reboot) and it is interesting to see that the NP pool on the rebooted system was using only 119 MB, a huge amount less than 4.7 GB! 
Something else I noticed when watching these displays is that the number of NP allocations each second is usually far greater than the number of frees. I don't know if that's normal (though I guess it might be typical of a system with an out-of-control growing NP pool). EIS's Private Bytes after the reboot was back to the figure of 486 MB, but since then it has climbed to 553 MB. I don't know if it will again climb & climb. Something odd happened just before I rebooted. I'd signed out of my day-to-day userid & in as my admin one. I did this because there was a minor backup I wanted to do from the admin id (and as I've once a long time ago experienced a windows hang after a sign-out & sign-in I try not to do that when I'm not prepared to reboot if I have to). Also I wondered if the other user would also show high memory use - it did. Anyway, as soon as that user's desktop came up (some apps, eg Dropbox, which run on my daily id don't start there, so it's quicker to start) I got an alert from EIS saying it had just done a software update and needed to restart its application. After it had done so I looked at the EIS update logs, because I was puzzled. I saw no signs of a software update having just been issued. It seemed as if it was a pending action from the software update of a few days ago. Is it right that a user session would need an app restart when it has just been logged-in, to activate a software change that was issued several days ago, which in any case had already been implemented via my day-to-day userid? The admin userid had not been 'disconnected' - I never do that - and in any case the whole system had been rebooted a few days earlier (which is when I started recording memory use). Very odd. > If you make a backup of your settings, and then reset everything back to factory defaults, > does this issue still happen? I'm not going to try that yet. I'd like to see if we can find out what's grabbing the storage. 
As it is, the reboot I felt forced to do last night in the interest of system (especially FS) integrity, might have lost us the chance to find out, but fortunately (or not, depending on your point of view) that looks not to be the case. Also... I've not changed any settings in EIS in the last few days, apart from - now before two boots ago - having had debug logging on for a while.

Useful URLs:

https://msdn.microsoft.com/en-us/library/windows/hardware/hh439648%28v=vs.85%29.aspx
 - a good clear overview of paging, user & system space, paged & non-paged parts of latter

https://blogs.technet.microsoft.com/markrussinovich/2008/07/21/pushing-the-limits-of-windows-physical-memory/
 - there's a 'meminfo' tool mentioned in this that digs details out of the PFN database which records what's in the pages etc, but unfortunately it doesn't work on my system; it was after I read about this that I found the SysInternals RAMmap and VMmap utilities.

https://blogs.technet.microsoft.com/markrussinovich/2008/11/17/pushing-the-limits-of-windows-virtual-memory/
 - describes the commit limit - that all /committed/ vs must be backed either by ram or paging file.
 - describes "Private Bytes" fairly accurately (discussion in the comments shows that even Mark R's initial description wasn't quite the whole story)

https://blogs.technet.microsoft.com/markrussinovich/2009/03/10/pushing-the-limits-of-windows-paged-and-nonpaged-pool/
 - describes eg the non-paged pool - areas where the OS and device drivers store their data, essentially everything that must never be paged out (so will be in real RAM)

https://msdn.microsoft.com/en-us/library/aa366778.aspx
 - Memory limits for each version of Windows, eg how big can a user address space be?
  21. > Since I'm not familiar with the any the code our developers have written for the update process, I can't say for certain how they handle it. Now you say that... ;-) The thing is, these last few interactions between us have followed your statement about why the cancel probably didn't work, your inclusion of pseudo-code etc... So what were you trying to do? If you don't /know/ how the code works, why try to put me off with descriptions of what it is doing? All this followed my question to you of: "has anyone thought about which thread and how it got stuck and why it couldn't be interrupted/ terminated/whatever by your 'cancel' process"... which seems to me to be a perfectly reasonable question from a customer who was unable to get an update process to cancel. It's not as if I invented the cancel button. Your code provided it, and it was reasonable for me to expect it to work. So, have any of the developers given any thought to why this did not work?
  22. > changing notification timeout I don't think those options (Control Panel - Ease of Access - Using computer without a display - Adjust time limits - How long should Windows notifications stay open) will be relevant (a) because I'm not using the machine without a display, but also (b) because they were set to the default of 5 seconds and the pop-ups already stay visible for much longer than that (in the absence of mouse activity). > We no longer have notifications ... People must be mad, then. It's not your fault as a software vendor if the OS does not allow you to change certain elements of the OS without a reboot. (It's something I'd have thought MS would pay more attention to, since it's impossible for businesses to run continuously available systems if they keep needing reboots.) That aside, users who complain about alerts ought to be able to discriminate between annoying things that maybe can wait, and those that any sane user would want to know immediately. > Our update logs already show the start/end time of each update. If an update fails > for any reason, then it should be reflected in the logs. I think that's not the case. I think that probably your log records showing each update are only written to the log when an update completes successfully. Maybe there are some entries for partial failures. But they don't have entries describing the start of an attempted update, OR your internal scheduler is not starting them when it should. Do you think if my EIS had shown umpteen started updates for the "No updates in three days" period I'd have raised this ticket? Nevertheless, looking back at what I wrote above I see that I described the lack of update activity in the log, but didn't show you a picture, so here it is. Note no log entries at all in the period 7-9 April.
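A gap like the one described above (no log entries at all for several days) is easy to detect mechanically once the log timestamps are available. The Python sketch below is only an illustration: the function name `find_gaps`, the six-hour threshold and the sample timestamps are invented, and it assumes the timestamps have already been read out of the update log by some other means.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_gap=timedelta(hours=6)):
    """Return (before, after) pairs where consecutive entries are too far apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


# Invented entries with nothing logged between 6 April and 10 April.
entries = [
    datetime(2016, 4, 6, 21, 0),
    datetime(2016, 4, 6, 22, 0),
    datetime(2016, 4, 10, 9, 0),
    datetime(2016, 4, 10, 10, 0),
]
for before, after in find_gaps(entries):
    print(f"no update entries between {before} and {after}")
```

A check of this kind only shows that nothing was recorded. It cannot distinguish between attempts that never started and attempts that started but wrote no entry, which is the distinction the post is asking about.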
  23. I've PMed you a link to a file saved from the SysInternals VMmap program, showing much more detail about the running a2service's virtual memory.
  24. What I would expect is that when a program requests allocation of some (virtual) memory, the OS would, when it satisfies that request, make sure that it was capable of 'backing' it with either real RAM or a slot in the paging file. After all, when your program places data in that memory it has to be stored somewhere. I would not expect the OS to allocate more pages than it could back, so on my machine with 8 GB RAM and a 4.47 GB paging file, I expect (apart from a small amount of memory used by one of the graphics cards) there to be a maximum of about 7.9 + 4.47 = 12.37 GB of memory available. If an app gets virtual storage from the OS and never writes to it then that page will never actually get physically swapped to the pagefile, because that would be a waste of time - it's got nothing in it - but the OS still has to expect that one day it may need to be saved and there has to be somewhere to put it - either real RAM, or a slot in the paging file. I think that the 2 GB 'Private Bytes' size represents all the pages that have been allocated (or just maybe the sum of all the allocated areas, so the sum of the pages which contain them would be greater), and that Working Set is the subset of the total number of allocated pages which are actually in RAM at the moment. For the OS to have allocated 2 GB of virtual storage to EIS, the paging tables must have at least 524,288 (ie 512k) entries describing that large number of pages. That alone is wasting a certain amount of (I suspect non-paged) pages of RAM, though it will presumably be attributed to the system rather than EIS.
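The arithmetic in the post above can be checked with a few lines of Python. Two figures here are assumptions that the post does not state: a page size of 4 KB (which is what makes 2 GB come to 524,288 pages) and 8 bytes per page-table entry on a 64-bit system. The function names are made up for the example.

```python
GIB = 1024 ** 3
PAGE_SIZE = 4 * 1024   # assumption: 4 KB pages
PTE_SIZE = 8           # assumption: 8-byte page-table entries on 64-bit


def commit_limit_gb(usable_ram_gb, pagefile_gb):
    """Rough upper bound on committed memory: usable RAM plus paging file."""
    return usable_ram_gb + pagefile_gb


def pages_for(n_bytes, page_size=PAGE_SIZE):
    """Number of pages needed to hold n_bytes, rounding up."""
    return -(-n_bytes // page_size)


pages = pages_for(2 * GIB)
print(round(commit_limit_gb(7.9, 4.47), 2), "GB commit limit")
print(pages, "pages for 2 GB of Private Bytes")
print(pages * PTE_SIZE // (1024 * 1024), "MB of page-table entries")
```

Under those assumptions the 524,288 entries themselves occupy about 4 MB, so the page-table overhead is real but small next to the 2 GB they describe.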