CPU time increase after long runtime

Falco shared this problem 5 months ago
Solved

Over the last few weeks I did some solo MTB trips and noticed a drop in battery runtime from 17 hours to 12 hours of screen-on time compared to my Android ROM benchmarks, even though I was mostly using only my high-contrast dashboard with low LCD brightness.


I first thought my battery was already worn out, because I modified my firmware to skip the Android auto-shutdown and keep draining the battery until the protection circuit cuts the power, which gains ~5% more runtime. But I could reproduce my 17-hour runtime in a controlled environment, and even better, I could measure CPU time differences between my test case and a more realistic use case by feeding the GPS simulation with my real data.

First of all, there is a difference between receiving 36,000 GPS points from roughly the same location and 36,000 GPS points from different locations.

  1. Different positions force map re-rendering
  2. Different positions rotate the map
  3. Different positions increase the recorded track length

I could fix the first two points:

  • disable map rotation during dashboard usage
  • disable position centering during dashboard usage

But after benchmarking this improved setup, I found out that guidance takes extra CPU cycles, and that a freshly started Locus Maps uses far fewer CPU cycles than an instance that already has some hours of runtime. I'm not sure what the reason is, because I can't execute many long-running use cases.

Don't blame track recording right now, I haven't profiled it so far, but at least I found out that something keeps increasing battery drain over time. That could be a reason, because calculating 36,000 points is more expensive than calculating 3,600 points.

I did four 90-minute tests, closing the Locus process after each run, and one 360-minute test with 4 segments. The 360-minute run kept increasing CPU times after each 90-minute segment.

3371b950afc0dd993474f6dd64c65364


More context for the data:

  • I used a 90-minute NMEA file with a nearly static position for the "stay" tests
  • I used a 15-hour GPX track for guidance
  • I used a 90-minute cycle ride NMEA file for the "move" tests (it matches the first 10% of the GPX track)
  • I always have sensor debug logging enabled to keep high-precision pressure measurements at 5 Hz, so my CPU times are maybe a bit higher than usual
  • All tests were done on a V3 vector map, except the online tile map test
  • Measurements were done with /proc/[pid]/stat
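As a side note on the measurement method, the CPU time of a process can be read from /proc/[pid]/stat by summing the utime and stime fields (fields 14 and 15, in clock ticks; the tick rate is typically 100 Hz). A minimal Python sketch of such a reader (my own illustration, not the tooling used in the tests):

```python
def cpu_seconds(stat_line: str, ticks_per_sec: int = 100) -> float:
    """Sum utime + stime (fields 14 and 15 of /proc/[pid]/stat) in seconds."""
    # comm (field 2) is wrapped in parentheses and may contain spaces,
    # so split off everything up to the last ')' before field parsing.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field N of the file is rest[N - 3].
    utime, stime = int(rest[11]), int(rest[12])
    return (utime + stime) / ticks_per_sec

# Example with a synthetic stat line (500 + 250 ticks at 100 Hz -> 7.5 s):
line = "1234 (locus map) S " + "0 " * 10 + "500 250 0 0"
print(cpu_seconds(line))  # 7.5
```

Sampling this value periodically and diffing consecutive readings gives the per-interval CPU time used in the tests below.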

I could send you all my test data (2 NMEA files and the GPX track) via email if you like.

At the end of the week I will try to find out what happens without track recording and what happens without sensor debug logging.

Best Answer
photo

After 108 hours of testing I found the following results:

Vector map rendering is amazing; your cache is working really well. If you just follow GPS and don't scroll around, there are basically no CPU cycles used at all. It's even 1% better than an empty map :D

A custom dashboard with 4 values and an elevation chart takes 9% of the overall Locus power consumption.

Track guidance usually takes 7% of Locus power consumption; only if the track is some kilometers away does it take 10-15%.


Track recording is the function with increasing CPU time: it takes an additional 5% of Locus power consumption per hour. After 1 hour it is already as expensive as track guidance, and after 2 hours it is even more expensive than a full-blown dashboard. The option to view only the last 1000 points instead of the whole recording doesn't save any CPU cycles (or less than 1%).

Most of the processing is done in handling GPS positions: 85% of Locus processing time is used to update everything after receiving new GPS data.

If GPS is disabled, Locus uses only 15% of its usual power draw. But on my old phone this is still 5% of the whole CPU capacity just for idling. This sounds suspicious too, because other apps use less power in their idle state.

If you have everything enabled, it will look like this and will drain your battery by 157%, which matches my observation that I lost 1/3 of my battery runtime somewhere.

d607a8f2f7a6271b28a8404ecf913056

To put these values into context, the baseline test uses 40% of the first core of my quad-core CPU, and all features together take 63%.
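Assuming the 157% figure above is relative to the baseline run, it lines up with these two per-core numbers:

```python
# Baseline uses 40% of one core, all features use 63% (values from the
# tests above); full setup expressed relative to baseline = 100%:
relative_drain = 63 / 40 * 100
print(relative_drain)  # 157.5, i.e. the ~157% drain mentioned above
```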

Android power management does a good job of putting all threads on CPU0, because CPU1, CPU2, and CPU3 only used 1% of total system cycles.

Comments (9)

photo
1

Hello Falco,

this is really impressive testing and a very serious topic, thanks for this!!


A small problem in code may over time grow into something like this, so I'll for sure try to help with a solution.

So your setup for the test:

  • NMEA simulation with 90 min long file (done over Locus Map > GPS Screen > simulation?)
  • vector maps in V3 version
  • enabled track recording (I suggest using the 1m/1s/OR parameters to simply record all points in all cases = better for a stable test)
  • enabled navigation
  • enabled logging of sensor data

Did I miss anything?

Quite a lot of parameters at once. It should be possible to do 90-minute tests:

  • a base-line test with only NMEA simulation + V3 maps
  • test with track recording only, with navigation only, with sensors only?

With this, it should be possible to detect the service that has a high impact and that I should focus on. In case of a lack of time on your side, let me know.

It would also be interesting to compare an online/raster map to the vector map baseline ... maybe in a free moment ;). In the case of vector maps, it will highly depend on the zoom level & movement (so generally on how often the app has to create new map tiles).

photo
1

I can confirm, with some notes:

  • NMEA simulation with 90 min long file (done over Locus Map > GPS Screen > simulation)
  • vector maps in V3 version, but I tried to remove any triggers of map rendering because I don't use the map on my fullscreen dashboard
  • enabled track recording (1m or 1s)
  • enabled guidance (no navigation, just following the line without routing, to be able to see the remaining distance to the target)
  • enabled logging of sensor data

I will create the following tests:

  1. Baseline (everything is off, v3 map but no map movement)
  2. record only, no map movement
  3. navigation only, no map movement
  4. sensor only, no map movement

I guess I need to test them twice, because even with 90 minutes I have some unknown results. The stay vs. move test was 10% better without guidance, but with guidance it was the other way around.

That means 8 times 6 hours, sounds like fun :D

I will mix in a V3 map rendering test (map rotation enabled, center-on-GPS enabled, and a fixed zoom level) because map rendering should be more expensive than track recording. If it isn't, then we either have incredible map rendering or awful track recording ;)

photo
1

Is there a way to disable map rendering? I was not sure how many CPU cycles get wasted on connection checks every few seconds if you select an online map without a connection.

I worry that my GPS position changes might trigger re-rendering of map tiles even if my position is off-screen, but I have no way to check this.

photo
1

Are you sure you want to invest so much time here? Probably yes, thanks :)

Probably best is to enable the "Blank map" (in the top menu in the Map manager). The most efficient way ...

photo
1

Well, if I do my 4-segment test 8 times, I will of course automate the 6-hour 4-segment run, because if I missed any segment change I would need to restart a full 6-hour run.

photo
photo
2

Next week I have another useless lonely week of vacation at home. I hate myself for putting so many holidays into Q2, because now I can't do anything with this free time except push unusual Locus use cases :D

photo
1

On today's cycle trip I did a run without the map on my dashboard screen, without navigation, and without recording (I extract the recording from the sensor debug data).

And it performed like on my test bench, maybe even better (of course, without features it should be better than my test bench with features enabled but no movement). Today it was 32% battery use in 7:15 hours. That would be 17 to 21 hours of screen-on runtime for the full battery.

My setup target is to use only around 80% of the battery on a 15-hour trip (screen and Locus off during lunch), to have 20% charge left for my bike light to get home. Based on my benchmarks (no movement but all features), I calculated that this goal is easy to achieve.

Good to see that at least with disabled features it will be no problem at all. Next week I will try to figure out which feature is eating the CPU cycles only during movement :)

By the way, is it possible to select the empty map in the preset feature? I use a non-translucent dashboard preset as a cycle speedometer.

photo
1

Hello Falco, nice, thanks.

The option to select a specific map is currently not in presets.

photo
photo
1

Almost done,

So far I can say the following

Track recording: 1-2% initially, then a stacking CPU time increase of 7% per 90 minutes.

Guidance: constant 7% CPU time if the next point is nearby and 16% if the next point is ~10-20 km away. That distance dependency is the reason for the old unexpected stay-test-with-guidance result: the next guidance point was 200 km away in my stay-test NMEA file and ate a lot of CPU cycles, maybe 30%.


So we could take a look at why track recording keeps increasing its CPU time usage. You have a toggle in the recording settings to show only the last X points to improve performance. Should I try the effect of this? It may have no effect if it only applies to map rendering.

photo
1

Here are some details so far


photo
1

You may have noticed the huge chunk of CPU time in GPS processing.

I may have found some very strange behavior in NMEA processing, where some useless lines can take up to 33% of the whole CPU cycles. This can't be true; I'm still working on some confirmation runs, and because 33% is so large I should be able to reproduce it with a real Bluetooth device as well. But I guess this was just a false positive, or it is only a bug in the GPS simulation. Or, even simpler: just some expensive Java exceptions during parsing.

If I can confirm it, I will create a new ticket for the GPS processing questions, because these results are not time dependent.

photo
1

Hello Falco, thanks for the invested time and observations.

I've something to start with, nice. I'll try to set up a testing environment in the emulator and let you know in the next days with some news (and maybe additional questions), thanks!

photo
1

Thank you.


After an additional 92 h, I'm at 200 hours in testing conditions in total, over 36 test cases, and I can tell you that everything is fine with GPS processing. As expected, the large CPU time chunk for GPS processing is related to simulation overhead.

If you use real GPS values, it takes much less CPU time. Even if you compare real GPS "RMC, GGA, GSV, GSA" vs. "RMC, GGA", it is just a 1% difference from baseline.

In the GPS simulation, each line adds about 10% of simulation overhead relative to baseline (empty line: +12%, GPGNS: +10%, PGRMF: +10%).

A test with real GPS data is so efficient that the recording CPU time issue is much more noticeable.


I updated the diagram to show the real impact of the recording based on Bluetooth GPS processing.

75cad3762d8cd8d317228bf896cb6962

I attached the updated Excel file as well. And I still have the output of all 2,404 /proc/[pid]/stat calls if you are interested in any other data ;)

photo
photo
1

Hello,

Yesterday I wrote my own testing system in the app, which is able to execute defined steps & measure CPU and battery usage over time. Based on your tests, I focused on track recording, and I think I've found an important thing.


The problem is connected to drawing the recorded track on the map and the number of points this track has. Nothing more is involved here. If you turn the screen off, or just display a screen other than the map, you should get constant CPU usage.

When the track is drawn, it always needs some preparation: initialization of locations, some optimizations, preparation of colors for special styles, etc. The problem is that if the track changes (like a newly recorded location), all of this has to be done again. And not only for the change, but completely from scratch. With more points, it becomes a more and more CPU-intensive operation.
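A toy cost model (my own illustration, not Locus code) shows why full re-initialization on every new point makes cumulative CPU time grow quadratically with recording length:

```python
def total_work(points_recorded: int, cost_per_point: float = 1.0) -> float:
    """Cumulative work if every newly recorded point reprocesses the whole track."""
    # The n-th insertion touches n points, so the total is 1 + 2 + ... + n.
    return cost_per_point * points_recorded * (points_recorded + 1) / 2

# Recording at 1 point/s: doubling the duration roughly quadruples the work,
# which is why each 90-minute segment costs more than the previous one.
w90, w180 = total_work(90 * 60), total_work(180 * 60)
print(round(w180 / w90, 2))  # ~4.0
```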

Not a very easy task, to be honest, but definitely solvable. For now, I mainly suggest decreasing the recording frequency, which may have a significant impact. I'll try to find more time for this during May and let you know about any news.

EDIT: at least some improvement to start with > found an unneeded double refresh of the track on the map, should help, fixed.

photo
1

Are you sure?

If it were only map drawing, the recording option "show only last X points" should result in constant CPU usage. I didn't look up the exact translation; in German it says "Länge des angezeigten Trackabschnittes" (1000), i.e. the number of the most recently recorded track points that are visible on the map.


Have a look at line 15, the "Record 4" test, in my Excel sheet. I used the option to view only some points because the help text suggested this for performance improvements. The default value of this option was 100, and I tested it with 1000. The result was the following:


106% 115% 121% 128% 137% 146%, the same result as Record 2: an unbounded CPU time increase of 7% per 90 minutes.
photo
1

Yes, I'm sure. This option only limits the visible points on the map, so it saves some CPU needed for the rendering itself. But it does not save all the necessary preparations for drawing. The track still needs to be recomputed and prepared for drawing.

I'll do some more tests overnight to verify this. NMEA simulation is really a problem here, as it has a major impact on CPU usage and I have no idea why :). Will look at it ...

photo
1

Please don't spend time on CPU time optimization for the GPS simulation; in my opinion, simulating only GGA and RMC without empty lines is already close enough to the CPU time of real GPS. It would maybe be more interesting to see whether there are some wasted CPU cycles in our relatively high idle footprint.


Do you mean that the 1000-point filter is applied after the full track computation? If yes, then this implementation is nearly pointless, and it would confirm why I did not find any meaningful performance improvement from this feature.

I could only imagine that there was a map scrolling issue in the past where tracks with over 1000 points would slow down scrolling on slow devices.

I thought this filter would apply before track processing.


Or is most of the CPU time used to calculate track length and track climb? I guessed that this calculation is maybe the problem.

photo
1

It would be cool if the 1000-point filter could be applied before the expensive CPU time operation.

I am sure that this could break many things; for example, the track length could be affected if the filter were applied too early.

What about another solution? Of course, you can just make the recording less detailed, but this would only move the problem to a later point in time.

I thought about less frequent recorded track recalculation: if we do this every second, we would get a 10 times improvement with a 10 s cycle. If you combine this with 10 times less detailed recording, this CPU time issue will never appear again, because we reduce the effect by 10 × 10 times. But I guess the problem here is that we can't use a 10 s recalculation cycle as the default value, because the delay would be too much for some people.


Edit: If the recalculation cycle is synchronous with the recording cycle (a point every 10 s = recalculation every 10 s), then reducing the track detail would already be enough to get a 100 times improvement. Is that the case, or is the recalculation always done every second?

This would be the best solution, but it could be difficult to implement if the old solution is based on a simple X-seconds cycle.

photo
1

I see a solution in splitting the track every, let's say, 1000 points and working with the blocks separately. So an old 1000-point block won't have to re-init again; only the last one will. The problems here are dynamic styles like coloring by altitude, which also affect old segments, and the already mentioned "draw last X points".


In the case of recording every 10 s, this is not a big problem, I believe. You have a 7% increase after 90 minutes with 1s/1m, right? Then recording with a 10 s frequency will have a 7% increase after a full day of recording (900 minutes) ... and only during screen-on. Which is acceptable, I believe.

Maybe I'll do some kind of block of the last X points that is drawn separately, while the whole good old recorded track refreshes only once in a while. Hmmm, I'll think about it, thanks for the useful loud thinking :).

photo
1

Please don't implement segmented track recording; this is way too complex, and the risk of future errors is high for minor CPU time improvements.


There should be only 2 ways to keep it less complex:

- process the whole track, with all its pros and cons (the current way)

- process only the last X points (will display only part of the track)


Would it be possible to make the existing performance function "filter for last X points" apply before processing the track, to make the 2nd solution more useful? Because currently it consumes nearly as many CPU cycles as without it (~1% difference).


And the other question, to understand why the impact of the point recording criteria is so big: how often is the recorded track currently processed?

- After every X seconds?

- After receiving a GPS fix?

- After meeting the record point criteria?

photo
1

The steps are the following:

Currently, a new GPS location is sent to the track recording service. This service checks whether the location passes all tests, and if so, the new location is added to the track. This is the moment when the track on the map invalidates itself; in the next "draw" operation, it is completely initialized.

So the trigger for the re-init is really "new point recorded" (among other things).
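The described flow can be sketched roughly like this (hypothetical names, 1-D positions for brevity; the real service uses distance/time/accuracy criteria):

```python
def record_location(track: list, position: float, min_dist: float = 1.0) -> bool:
    """Append a new position if it passes a (simplified) recording filter.

    Returns True when a point was recorded, i.e. the moment the map track
    would invalidate itself and fully re-initialize on the next draw.
    """
    if not track or abs(position - track[-1]) >= min_dist:
        track.append(position)
        return True
    return False

track = []
print(record_location(track, 0.0))  # True  -> re-init triggered
print(record_location(track, 0.5))  # False -> filtered out, no re-init
print(record_location(track, 1.5))  # True  -> re-init triggered again
```

This is why the point criteria matter so much: stricter criteria mean fewer recorded points and therefore fewer full re-initializations.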

photo
1

That's cool. So basically, if you change it to 10 s, you will get a 100 times improvement, because:

you process 1/10 of the points and recalculate only every 10 s.


I guess it is difficult to reduce the scope of the track processing from all points to the last 1000 points based on the existing feature. Did you already find out why this feature does not change the recording's time-dependent CPU cost? You already wrote that this option only affects the drawing time of the track, but not why we have to prepare all track points if we just want to draw the last 1000.

photo
photo
1

I did a single test run with 6s or 1000m; this setup will update every 5 s.

But instead of


106% 115% 121% 128% 137% 146% 1s or 1m

I got


133% 137% 140% 143% 145% 142% 6s or 1000m

I expected something around a 25 times smaller CPU time increase, because of 5 times fewer recalculations * 5 times fewer track points.


9% increase / ~10-20 times more efficient = 0.9%-0.45% increase every 90 minutes


But instead of 0.45%, I got a 3% increase every 90 minutes for the 6s or 1000m recording. And the change from 1s or 1m to 6s or 1000m added 24% more CPU time, which cannot be true. I am currently running a verification run with

1s or 1000m

and another one with 1s or 1s


This looks like there is another recording-related process that is triggered more often than every 5 s, because if the recording processing were triggered only every 5 seconds, I would at least see a 5 times improvement: 9% CPU time increase / 5 = 1.8%. But I got a 3% increase with 5 times fewer points and 25 times less recalculation effort.


But the good thing is, if you look into the Excel file, recording is the only process that accumulates CPU time. All other use cases resulted in very consistent CPU times; there was not a single percentage point of difference between the first 5 minutes and the last 5 minutes after 9 hours of runtime for the other features. This is a great result.

photo
1

Maybe this issue is unrelated to recording track rendering?

Let's say the 1 s recording with its 9% CPU time increase per 90 minutes takes

  • 3% CPU time for calculations
  • + 6% CPU time for track preparation and rendering

My 6 s recording would then take

  • 3% CPU time for calculations
  • + 0.24% for track preparation and rendering (6% / 25 times less effort)

Then you would get a 9% vs. 3% CPU time increase, even with 25 times improved track rendering.
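That split can be expressed as a small two-component model (the 3%/6% split is my assumption, fitted to the numbers above):

```python
def rise_per_90min(calc_pct: float, render_pct: float, render_speedup: float) -> float:
    # Only the track preparation/rendering share benefits from recording
    # less often; the per-fix calculation share stays constant.
    return calc_pct + render_pct / render_speedup

print(rise_per_90min(3.0, 6.0, 1.0))   # 9.0  -> the 1 s recording
print(rise_per_90min(3.0, 6.0, 25.0))  # 3.24 -> close to the observed ~3%
```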

photo
1

My 6s or 1000m run was invalid: I couldn't confirm the 133% to 145% CPU times. I made a mistake and need to retest this run.

photo
1

Now I have 2 good runs with 9 hours of runtime.


The CPU time increase dropped by a factor of 5, from ~9% to ~2% per 90 minutes:


99% 102% 104% 106% 108% 107% 6s or 1000m
101% 104% 106% 108% 109% 108% 6s or 1000m


Currently we have the following warnings for the recording setup:

  • value is below recommended 10m
  • value is below recommended 0s

Maybe the second one should be "value is below recommended 3s", because at ~10 km/h, 3 s is similar to 10 m.
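The 3 s suggestion checks out arithmetically (using the ~10 km/h speed from the comment above):

```python
speed_ms = 10 * 1000 / 3600   # 10 km/h in m/s (~2.78)
distance_in_3s = speed_ms * 3
print(round(distance_in_3s, 1))  # 8.3 -> close to the 10 m distance default
```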


By the way, why do I need to setup 6s to get recording track update every 5s? Is this an undiscovered issue?

photo
photo
2

Hello Falco,

thanks for the tests, but I really suggest waiting for the results of some of my experiments and not wasting your time on this.

I made my own small testing system with an option to speed up the (optimized) NMEA simulation by 10x. And here are the results.

/d06c872d49ca7157a7c6c9be35d498fd

Consider all times as 10x faster, so a whole single test would otherwise take 30 hours!

Setup: emulator in AirPlane mode, all possible services disabled

Steps:

  • 30 minutes calibration, so only map visible without any activity.
  • 30 minutes only movement with NMEA simulation
  • 2 hours of track recording. At the end (so after 20 hours of recording), the track had 50,000 recorded trackpoints

Chart

  • "baseline" is the current public system
  • optimize 3 is the final best result I was able to achieve

Notes


  • it seems that with this sped-up NMEA simulation, the device was not able to handle more than 25-30% CPU; then the map started to lag, and it seems the emulator was at its limit. So baseline & optimize 1 are cut off by the emulator in the second half
  • the weird jump at 1:45 is unknown to me; I was still unable to find the reason

Well, I'm looking forward to confirmation on your side with next Beta version ;)

photo
1

Thank you.


I have this jump too, maybe around 8 real hours. Before it appeared a 4th time, I thought it was some weird Android issue.

Last time, for example, I got something weird: CPU times rose by 30% even after the usual closing of Locus and a force close via the app settings. I reapplied many Locus settings, but that was not the issue. I really needed an Android reboot to fix it. At first I thought about a power management issue where my CPU might be stuck at a 30% lower clock speed, but even with discharging and recharging, and checks with and without plugged-in power, it was always just +30% until I finally restarted the system. This was the reason why my first 6 s run was invalid.

photo