The Notebook Review forums were hosted by TechTarget, who shut them down on January 31, 2022. This static read-only archive was pulled by NBR forum users between January 20 and January 31, 2022, in an effort to make sure that the valuable technical information that had been posted on the forums is preserved. For current discussions, many NBR forum users moved over to NotebookTalk.net after the shutdown.

    So why does windows only use 13% of CPU for single thread on a quad core?

    Discussion in 'Windows OS and Software' started by HopelesslyFaithful, Nov 14, 2013.

  1. HopelesslyFaithful

    HopelesslyFaithful Notebook Virtuoso

    Reputations:
    1,552
    Messages:
    3,271
    Likes Received:
    164
    Trophy Points:
    0
    Alright, I have made many statements in various threads about my frustration at not understanding why a quad core with hyper-threading only uses 1/8 of the CPU for single-threaded operations. Would I see an improvement if it wasn't hyper-threaded? I decided to test something real quick to see whether it is a glitch in Task Manager or it really only uses 13% of the CPU. I personally would get rid of HT if I could, since I do a lot with single-threaded programs and this really kills me. I just ran TS bench and there is a tangible difference between thread counts. Here is a screenshot. Please, someone enlighten me on why single thread blows on an HT CPU. How can HT butcher it so badly?

    Threads and usage
    1) 13%
    2) 25%
    4) 50%
    8) 100% (90-99% used; background programs take up 1-10% during the test, which I believe is why 8+ threads isn't twice as fast as 4 threads)
    12) 100% (90-99%)
    16) 100% (90-99%)

    It also maintained ~3.8GHz throughout all tests. I also won't get into why Intel's magical 1/2/3/4 core multipliers won't work. I have tried in the past to rig them to work on both my 920xm and 3720qm, but it never does. I'll set 40 26 26 26 and nothing: it just runs at ~26-27x across the board. Or on my 920xm I'll set 27 27 15 15 and it runs at ~15-16x -_- Different topic for another day.

    TS bench scores.png

    If you multiply the 8.5s test by .9 you get ~half of the 4 threaded test. So why do a single thread test, a 2 threaded test, and a 4 threaded test run at ~50% of what they should?

    I have also witnessed TS report 43x multiplier on my 3720qm too lol

    here it is for giggles
    wtf 4.3 ghz lol resized.png

    Why does technology never work right for me -_- :'( I swear something is always glitched out.
     
  2. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    For TS bench this looks about right but your 8 thread seems a bit long............

    TS_32m.jpg
     
  3. SL2

    SL2 Notebook Deity

    Reputations:
    829
    Messages:
    1,340
    Likes Received:
    266
    Trophy Points:
    101
    A single thread operation uses one CPU thread, in your case 1/8 of the CPU, or 12.5 %.
    Two thread operations use two CPU threads, in your case 2/8 of the CPU, or 25 %.

    etc...

    To utilize 100 % you must use 8 CPU threads, which is impossible with anything less than 8 thread operations, which you've already confirmed with your numbers.

    This is normal, my 2600K works the same way.
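
The arithmetic in this post can be sketched in a few lines of Python. This is only an illustration of the "busy logical processors divided by total logical processors" accounting described here; the function name `overall_utilization` is made up for the example and is not a Windows or Task Manager API.

```python
def overall_utilization(busy_threads: int, logical_processors: int) -> float:
    """Percent of the whole CPU shown when `busy_threads` software threads
    each keep one logical processor fully loaded (hypothetical helper)."""
    # More software threads than logical processors cannot exceed 100 %.
    busy = min(busy_threads, logical_processors)
    return 100.0 * busy / logical_processors


# Quad core with HT on (8 logical processors) vs HT off (4 logical processors).
for n in (1, 2, 4, 8, 12, 16):
    print(f"{n:>2} threads: HT on {overall_utilization(n, 8):5.1f} %   "
          f"HT off {overall_utilization(n, 4):5.1f} %")
```

The percentage only says how many logical processors are occupied. It says nothing about how fast each thread is running.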
     
  4. HopelesslyFaithful

    HopelesslyFaithful Notebook Virtuoso

    Reputations:
    1,552
    Messages:
    3,271
    Likes Received:
    164
    Trophy Points:
    0
    You missed the point. Why have HT if it causes everything to run at 50% speed to gain a few percentage points on a highly threaded operation?



    The point is I have 4 cores...4 CPUs. If I had HT off I could get twice the single, 2 and 4 thread performance...why does HT nerf a CPU? Why can't a single thread operation use a whole processor? From what I am seeing, HT hurts more than it helps.



    So a non HT CPU or one with it turned off will offer twice the performance?
     
  5. ajnindlo

    ajnindlo Notebook Deity

    Reputations:
    265
    Messages:
    1,357
    Likes Received:
    87
    Trophy Points:
    66
  6. SL2

    SL2 Notebook Deity

    Reputations:
    829
    Messages:
    1,340
    Likes Received:
    266
    Trophy Points:
    101
    It doesn't work like that.

    If you have 4 CPU threads, each thread makes up for 25 % of the CPU.
    If you have 8 CPU threads, each thread makes up for 12.5 % of the CPU.

    Running 4 thread operations in the first case makes the CPU run at 100 %, but in the second case it runs at 50 % because it only uses half of the threads available.
    Do the math.

    No.

    Turning HT off gives you 4 threads instead of 8, so a single thread utilizes 25 % instead of 12.5 % of the CPU.
     
  7. SL2

    SL2 Notebook Deity

    Reputations:
    829
    Messages:
    1,340
    Likes Received:
    266
    Trophy Points:
    101
    Because you don't have a single threaded CPU.
    By your reasoning, a single core CPU is the best choice, because it always utilizes 100 % of the CPU with one single thread operation.
     
  8. RCB

    RCB Notebook Deity

    Reputations:
    644
    Messages:
    1,065
    Likes Received:
    103
    Trophy Points:
    81
    What moves faster at rush hour at 70Mph max:

    An eight lane or a 4 lane or a 2 lane or a 1 lane highway.
     
  9. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    His point is based on the premise that a single core without HT will be 2x as fast; this, though, is not true. It is proven that a core with HT enabled is faster overall for anything with more than one thread, with little to no penalty to a single thread. The secondary hyper thread only uses its cycles where required and the primary thread for the core gets full usage. Now Task Manager reports load per thread: if the first thread is at 100% load then it reports the load as 12.5% for the entire CPU. This is the same for each of the other 3 primary cores and then the other 4 HT'd cores.

    Edit: to show you, in the second graphic I've added I disabled HT in the BIOS. You can see that 4 threads even takes a slight penalty, as the system chores are now relegated to the worker cores............

    View attachment 104935

    TS_32m_NHT.jpg
     
  10. ajnindlo

    ajnindlo Notebook Deity

    Reputations:
    265
    Messages:
    1,357
    Likes Received:
    87
    Trophy Points:
    66
    Judging CPU usage by looking at thread usage seems misdirected to me. This is like saying you only use 25% of your car when you drive alone. What does that have to do with performance? Not to mention, how can you use half a core? You can't use half a core, which means you can't use 12.5% of the CPU. Hyperthreading does not give you more cores, it just gives you a different way to feed them in a multitasked/multithreaded environment.

    So does hyperthreading speed up programs? The answer is mostly no. See this testing done on hyperthreading http://www.tomshardware.com/reviews/Intel-Core-i7-Nehalem,2057-12.html

    As for the highway analogy, the highway only has four lanes, i.e. four cores. Hyperthreading just reduces the gaps between cars in one lane. And if you think of one program running, then that program will use one lane. That lane could be jammed up, while the rest of the freeway lanes are wide open. If the program is multithreaded, it could use more than one lane, or could use that one lane more efficiently. The freeway has its advantages when more programs are running, thus more lanes are used. Most games only use a couple of threads, which you don't need hyperthreading for.

    So back to the highway analogy. If you are only using one lane, and the rest of the freeway is empty. Then the cops will increase the speed limit. So in theory one program will run faster if it is running by itself. Start running more programs, and you use more lanes, then the speed limit drops. Of course by speed limit, I am talking about the clock speed of the cpu. That limits how fast each thread, or program if single threaded, can run.



    The video I posted above gives a simpler explanation of hyperthreading using food and mouths.
     
    Qing Dao likes this.
  11. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    Actually, using a highway analogy, think of the quad core as four lanes at all times. Now without HT we are all in economy cars taking up the four lanes. Now with HT we have eight people on motorcycles using those four lanes riding two abreast. Now nomenclature will occasionally give hiccups and maneuvering can be tricky at times but if all goes smoothly we can almost double the people able to travel the road.................
     
  12. RCB

    RCB Notebook Deity

    Reputations:
    644
    Messages:
    1,065
    Likes Received:
    103
    Trophy Points:
    81
    I was going to ask about the effect that would have.


    Then you sorted it with the highway motorcycle analogy:
    Though I got what he was saying with the food analogy, it's too hard for me to reproduce easily:
     
  13. HopelesslyFaithful

    HopelesslyFaithful Notebook Virtuoso

    Reputations:
    1,552
    Messages:
    3,271
    Likes Received:
    164
    Trophy Points:
    0
    I remember reading an article saying that HT at best gives a 10% boost in performance in certain cases; otherwise it only adds a little bit of extra performance and in some cases causes a slight reduction. I forget where the article is. It was a very long time ago, when we were discussing HT in some other thread.

    I still find it hard to believe that HT somehow increases performance by 100%....i have NEVER heard HT improving performance by 100%. Something has to be off.


    BTW your 4 threaded one is irrelevant....there is always a decent variation in numbers between tests.
     
  14. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    His food analogy is a bit more on point of HT but you then have to understand CPU cycles and latencies etc.. It does not however explain the 12.5% where the vehicle analogy does a better job there. These are all analogies so they will always just explain some of the issues and according to what you are looking for they can at times confuse more than explain.......................

    Not really. On a truly well optimized multithread, like the TS bench, it can make quite a difference. Most things though are not this way. The performance improvement is there but not as high. These worker threads are small with a high latency factor so a HT multicore is optimal.

    As for real life, the cores' cache structure along with other tweaks since the early days of HT have improved efficiency. This includes NUMA along with a host of other goodies. Just do not expect a 100% improvement, and in fact in some cases you will still see a loss in performance compared with a dual core with HT running at higher clocks.
     
  15. HopelesslyFaithful

    HopelesslyFaithful Notebook Virtuoso

    Reputations:
    1,552
    Messages:
    3,271
    Likes Received:
    164
    Trophy Points:
    0
    just posted at the same time as you lol
     
  16. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    LOL, and I edited to your post while I was typing............
     
  17. HopelesslyFaithful

    HopelesslyFaithful Notebook Virtuoso

    Reputations:
    1,552
    Messages:
    3,271
    Likes Received:
    164
    Trophy Points:
    0
    i guess.....i still find it odd that HT in TS bench pulls off a near 100% performance boost....very hard to believe.
     
  18. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    It is the tweaking of the CPU with the cache latencies, sizes even the "SmartCache" structure. They, Intel, have optimized everything for HT not for running without HT inside the CPU. I am quite sure they could optimize to run without HT and this would improve performance on single threads etc. but this is not where they are going with the CPU's......
     
  19. HTWingNut

    HTWingNut Potato

    Reputations:
    21,580
    Messages:
    35,370
    Likes Received:
    9,877
    Trophy Points:
    931
    This is not complicated. It's CPU utilization period, nothing to do with performance.

    A hyperthreaded CPU is seen as two cores for every one. So:
    1/8 = 12.5%
    2/8 = 25.0%
    3/8 = 37.5%
    4/8 = 50.0%
    5/8 = 62.5%
    6/8 = 75.0%
    7/8 = 87.5%
    8/8 = 100.0%

    Turn off hyperthreading and it's;
    1/4 = 25.0%
    2/4 = 50.0%
    3/4 = 75.0%
    4/4 = 100.0%

    Hyper threading can give 10-15% performance boost at best. Run some raw benchmarks and compare final values.
     
  20. HopelesslyFaithful

    HopelesslyFaithful Notebook Virtuoso

    Reputations:
    1,552
    Messages:
    3,271
    Likes Received:
    164
    Trophy Points:
    0
    you obviously haven't read the thread....or looked at the attached pictures -_-
     
  21. Qing Dao

    Qing Dao Notebook Deity

    Reputations:
    1,600
    Messages:
    1,771
    Likes Received:
    304
    Trophy Points:
    101
    You have to be kidding me. You seem to be the one who is clueless here.
     
  22. HopelesslyFaithful

    HopelesslyFaithful Notebook Virtuoso

    Reputations:
    1,552
    Messages:
    3,271
    Likes Received:
    164
    Trophy Points:
    0
    Do you ever stop trolling? It has been shown in TANWare's screenshots that with HT off it is ~50% slower (for 8-16 threads) but 1-4 threads are the same speed....hence why I had a hard time believing there was that much of a difference. So go troll elsewhere. What the hell is your problem anyways?
     
    Qing Dao likes this.
  23. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    The thing is, for real-world use HT only yields a 10-15% benefit, but some synthetics will show more, e.g. the TS benchmark of 32m.

    Without HT (disabled in bios) and then with it you get

    1 Thread = 52.296 vs. 52.500
    2 Thread = 26.413 vs. 26.797
    4 Thread = 13.939 vs. 13.907
    8 Thread = 14.553 vs. 7.663
    12 Thread = 13.738 vs. 8.155
    16 Thread = 13.792 vs. 8.159

    So the numbers speak for themselves on this benchmark...........
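
To put those numbers in ratio form, here is a small Python sketch that divides the HT-off time by the HT-on time for each thread count. The times are copied from the list above (assumed to be seconds); the dictionary and the `ht_speedup` helper are just names chosen for this example.

```python
# (HT off, HT on) times per thread count, copied from the post above.
times = {
    1: (52.296, 52.500),
    2: (26.413, 26.797),
    4: (13.939, 13.907),
    8: (14.553, 7.663),
    12: (13.738, 8.155),
    16: (13.792, 8.159),
}


def ht_speedup(threads: int) -> float:
    """How many times faster the HT-on run was than the HT-off run."""
    off, on = times[threads]
    return off / on


for threads in times:
    print(f"{threads:>2} threads: HT on is {ht_speedup(threads):.2f}x the speed of HT off")
```

For 1, 2 and 4 threads the ratio stays within about 1-2% of 1.0, i.e. no meaningful penalty from HT. At 8 threads it is roughly 1.9x, and at 12 and 16 threads roughly 1.7x.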

    Reattach original images

    View attachment 104935 View attachment 104936
     
  24. HTWingNut

    HTWingNut Potato

    Reputations:
    21,580
    Messages:
    35,370
    Likes Received:
    9,877
    Trophy Points:
    931
    Run something like x264 and you will see more of a real world impact. I'll do it later but am busy setting up for a big birthday party this weekend

    Beamed from my G2 Tricorder
     
  25. HopelesslyFaithful

    HopelesslyFaithful Notebook Virtuoso

    Reputations:
    1,552
    Messages:
    3,271
    Likes Received:
    164
    Trophy Points:
    0
    i know real world numbers are not even close to that...hence why i was wondering if HT was butchering 1/2/4 threaded stuff but it isn't....just TS bench is a weird benchmark when it comes to that. Hence why i posted this thread asking.
     
  26. RCB

    RCB Notebook Deity

    Reputations:
    644
    Messages:
    1,065
    Likes Received:
    103
    Trophy Points:
    81
    Glad you did! This is my first HT processor and I had been wondering about some of the finer details involved.
     
  27. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    This thread in no way gets to the finer details. It is actually quite involved to say the least. This thread is just to give the gist of what it is and the basics of where it best works etc.. TBH HT works better today than on the original CPU's where it was introduced but not that much better. The introduction of all on die and smart cache help a lot along with more and better cores in which to spread the load. Also Intel has now had a bit of time to develop tweaks and features to best utilize HT. You really can't compare the HT of today to the HT on CPU's from 10 years ago.................
     
  28. RCB

    RCB Notebook Deity

    Reputations:
    644
    Messages:
    1,065
    Likes Received:
    103
    Trophy Points:
    81
    Of course...
    it is more complicated than this thread.
    Half the problem of finding and figuring out something is to know what to look for to begin with. Then you have to trust the source of that information.
    This isn't a topic that I want to spend a ton of time on searching/reading/watching people nuance in too much finer detail.

    The analogies are working pretty good for me.

    :)
     
  29. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    You would be amazed at how many people want to call you out when things are oversimplified. There are purists that just do not realize people want a simple explanation of things and how they work. I get beat up about it all the time but there are lurkers here as well that would like to understand the info.
     
  30. RCB

    RCB Notebook Deity

    Reputations:
    644
    Messages:
    1,065
    Likes Received:
    103
    Trophy Points:
    81
    Last question -

    Is there any practical reason for turning off HT in the bios and running strictly 4 cores?

    For instance would there need to be a specialty software requirement and also that some windows processes turned off to not interrupt that.
     
  31. HopelesslyFaithful

    HopelesslyFaithful Notebook Virtuoso

    Reputations:
    1,552
    Messages:
    3,271
    Likes Received:
    164
    Trophy Points:
    0
    Not really....in the one article (I can no longer recall where it was) the biggest performance hit was like 3% IIRC. In general it either helps or does nothing.
     
  32. tijo

    tijo Sacred Blame

    Reputations:
    7,588
    Messages:
    10,023
    Likes Received:
    1,077
    Trophy Points:
    581
    In the Pentium 4 days, HT was a hindrance, now, not really and in some tasks it does help.
     
  33. ajnindlo

    ajnindlo Notebook Deity

    Reputations:
    265
    Messages:
    1,357
    Likes Received:
    87
    Trophy Points:
    66
    Here are some real world benchmarks. http://www.tomshardware.com/reviews/Intel-Core-i7-Nehalem,2057-12.html

    I posted before, but the wall of text must have hidden it.

    Some things run faster without HT. See my link with benchmarks above. Personally I don't think the difference is worth it. But some unique cases might make it worth it.

    True, but also programmers are changing. Ten years ago I didn't know how to do a multithreaded program. Today I do. Not only that, but programming languages have evolved as well. It is now much easier for me to write multithreaded programs than it would have been ten years ago.

    Still, it seems we have a long way to go...
     
  34. RCB

    RCB Notebook Deity

    Reputations:
    644
    Messages:
    1,065
    Likes Received:
    103
    Trophy Points:
    81
    I went over to look earlier but got derailed. Just went and noticed it was from 2008... have to assume that since then, circa Jan. 2012, things had gained further improvement especially in platform hardware and OS support.
    From what I could gather from a cursory search of web discussions was that windows support also contributes to it working effectively.

    Since I'm not a single purpose machine, and need the medium heavy multi-tasking, I'd surely suffer in some way with HT disabled.
     
  35. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    Yeah, saw those numbers but because of their age did not link to them myself. It holds true as well, especially with HT, that most higher clocked and cheaper dual cores will better serve most users. Now visitors here may take a different view on this but then again we here tend to not be the average user................
     
  36. tijo

    tijo Sacred Blame

    Reputations:
    7,588
    Messages:
    10,023
    Likes Received:
    1,077
    Trophy Points:
    581
    True, but now, given the clock speeds of the quads compared to the clock speeds of the first gen Core i, the difference in single threaded clocks between a dual core i5 and a quad i7 isn't that high. Back in the Arrandale/Clarksfield days there was a rather large difference, but now I doubt the small difference will affect most users.

    I've seen a few instances where HT did absolutely nothing, by the way. COMSOL, an FEA software I use sometimes, stopped using HT after version 4.0 because of that. Encoding video in h264, on the other hand, benefited from HT.
     
    HopelesslyFaithful likes this.
  37. gdansk

    gdansk Notebook Deity

    Reputations:
    325
    Messages:
    728
    Likes Received:
    42
    Trophy Points:
    41
    I'm sure it's been said but here's how Hyper-Threading works: it duplicates the state of a core but it does not duplicate the execution units (they're big). This means that you have threads A and B sharing the same execution units (they actually do math).

    The operating system's job is to issue instructions to each execution thread. This allows the execution units to operate on thread B while thread A is waiting for something (let's say a read from memory). Windows thus only sees 12.5% of threading resources occupied even though 25% of execution resources are occupied (though it's probably doing NOPs when waiting for memory). Now, also remember that Windows always has more than one thread and so cycles them through all the available threading resources. It aims to keep each thread on the same "virtual core" as the last time it was executed. But if you have, let's say, 4 major threads, Windows will send one to each core and use the virtual cores for less busy threads (on a quad-core + HT). HT, essentially, allows cores to be busy more often.
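
The "work on thread B while thread A is waiting" idea can be sketched as a toy cycle counter in Python. This is a deliberately crude model, not how a real core schedules instructions: it assumes one shared execution unit, that every unit of work is followed by a fixed memory stall, and that the first ready thread always wins. The function name and its `work`, `burst` and `stall` parameters are invented for the illustration.

```python
def cycles_to_finish(num_threads: int, work: int, burst: int = 1, stall: int = 1) -> int:
    """Cycles one core needs to finish `num_threads` hardware threads that each
    need `work` compute cycles and stall `stall` cycles after every `burst`
    compute cycles (toy model with a single shared execution unit)."""
    remaining = [work] * num_threads
    ready_at = [0] * num_threads   # first cycle at which each thread may issue again
    in_burst = [0] * num_threads   # compute cycles done since the last stall
    cycle = 0
    while any(remaining):
        for t in range(num_threads):
            if remaining[t] and ready_at[t] <= cycle:
                # The execution unit serves this thread for one cycle.
                remaining[t] -= 1
                in_burst[t] += 1
                if in_burst[t] == burst:
                    in_burst[t] = 0
                    ready_at[t] = cycle + 1 + stall
                break  # only one thread can use the execution unit per cycle
        cycle += 1
    return cycle


one = cycles_to_finish(1, 100)   # one thread, execution unit idles during stalls
two = cycles_to_finish(2, 100)   # two threads fill each other's stalls
print(f"1 thread: {one} cycles, 2 threads on the same core: {two} cycles")
```

With these made-up parameters, two stall-heavy threads sharing the core finish in about the same number of cycles as one thread alone, which is the near-2x case that a latency-bound benchmark can show. Setting `stall=0` removes the gaps to fill, and the second thread then gains nothing.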
     
    HopelesslyFaithful and RCB like this.
  38. ajnindlo

    ajnindlo Notebook Deity

    Reputations:
    265
    Messages:
    1,357
    Likes Received:
    87
    Trophy Points:
    66
    True, I linked to older benchmarks for HT. Does anyone have newer benchmarks with real world programs?
     
  39. TANWare

    TANWare Just This Side of Senile, I think. Super Moderator

    Reputations:
    2,548
    Messages:
    9,585
    Likes Received:
    4,997
    Trophy Points:
    431
    I do not see many modern benchmarks with this info. It would actually be nice for some site to revisit this issue, that is, to see if HT is still worth it or has become even more relevant. Now if it were an issue looked at just a year ago with the last gen CPUs I would say no, but after 5 years and at least three ticks of CPUs it may just be time.