Core i7 logic | NotebookReview

fred2028 Sexy member

Just curious as to how the Core i7s handle their stuff. Take the 820QM for example. 4 physical + 4 virtual cores, so 8 threads total.

When a dual-threaded app comes along, does it use 1 physical + 1 virtual core or 2 physical cores and 0 virtual cores?

When turbo boost turns off certain cores, does it

Turn off all cores except the one(s) that have/has a thread being worked on

Redirect all thread(s) to a specific core and then turn off all other cores?

fred2028, Dec 14, 2009

#1

funky monk Notebook Deity

I think that if a multi threaded application came along, it would probably use two physical cores but put the clock speed down. That way it would use less power since power draw increases exponentially with clock speed.

Again, for turboboost I would think that it would try to spread the load evenly between as many cores as possible to reduce power consumption. This way it also reduces heat, producing faster processors as they can be reliably clocked higher without worry of overheating

funky monk, Dec 14, 2009

#2

notyou Notebook Deity

fred2028 said: ↑

Just curious as to how the Core i7s handle their stuff. Take the 820QM for example. 4 physical + 4 virtual cores, so 8 threads total.

When a dual-threaded app comes along, does it use 1 physical + 1 virtual core or 2 physical cores and 0 virtual cores?

When turbo boost turns off certain cores, does it

Turn off all cores except the one(s) that have/has a thread being worked on

Redirect all thread(s) to a specific core and then turn off all other cores?

funky monk said: ↑

I think that if a multi threaded application came along, it would probably use two physical cores but put the clock speed down. That way it would use less power since power draw increases exponentially with clock speed.

Again, for turboboost I would think that it would try to spread the load evenly between as many cores as possible to reduce power consumption. This way it also reduces heat, producing faster processors as they can be reliably clocked higher without worry of overheating

I'm not sure how the i7's handle the thread pinning. I did however do a paper on the Nehalem architecture just a few weeks ago so I'll share what I've gotten from that.

1) As far as I'm aware, Nehalem will power down all cores under a certain threshold.
2) @Funky - I believe you have it backwards. Power draw increases more with voltage than with clock speed. Though they tend to go hand in hand since more voltage is often required for higher clock speeds. Note, take a look at undervolting and why it makes such a big difference.
3) The thing about turbo boost is that it works best when only a single core is working. In the papers I studied, having two cores loaded meant that the clock speed gains from TB were smaller than a single core and thus, if you're able to just have one core fully loaded, the application will perform fastest that way (note that this is only for single-threaded applications).
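
To put rough numbers on point 2: the usual first-order model for CMOS dynamic power is P ≈ C·V²·f, so voltage enters squared while clock speed enters only linearly. The sketch below is just that textbook approximation. It ignores leakage, the function name is made up for illustration, and the 10% figures are illustration values, not measurements from an 820QM.

```python
def relative_dynamic_power(voltage_scale: float, frequency_scale: float) -> float:
    """Dynamic power relative to a baseline, using P ~ C * V^2 * f.

    Both arguments are ratios against the baseline (1.0 = unchanged).
    The capacitance term C cancels out because it is the same chip.
    """
    return voltage_scale ** 2 * frequency_scale


# Assumed example: drop either the voltage or the clock by 10%.
undervolted = relative_dynamic_power(voltage_scale=0.9, frequency_scale=1.0)
underclocked = relative_dynamic_power(voltage_scale=1.0, frequency_scale=0.9)

print(f"10% lower voltage: {undervolted:.2f}x baseline power")
print(f"10% lower clock:   {underclocked:.2f}x baseline power")
```

In this simple model the same 10% cut saves roughly twice as much power when it is applied to voltage, which fits the point about undervolting.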

notyou, Dec 15, 2009

#3

Judicator Judged and found wanting.

From a conversation we were having in another thread (http://forum.notebookreview.com/showthread.php?t=440539), it apparently depends on your OS. The server version of Windows 7 would go physical core + logical core, while non-server versions of Windows 7 would prefer to go with physical cores.

Judicator, Dec 15, 2009

#4

Generic User #2 Notebook Deity

from what I've read, it is VERY dependent on what is actually being processed and how well the OS recognizes HT-processors.

for something 'heavy' like a dual-threaded video encoder (bear with me), it would go straight for two physical cores. it would also use an HT-thread to run OS processes.

however, if the system was idle, it would run OS processes on a real core and then move those processes onto an HT-thread when a more CPU-hungry application is run.

basically, it will try to use HT-threads when the OS thinks that the HT-thread has enough power for the process; otherwise, it will go for a real thread.

Generic User #2, Dec 15, 2009

#5

thinkpad knows best Notebook Deity

That is a good idea, since virtual "cores" usually perform worse than physical cores.

thinkpad knows best, Dec 15, 2009

#6

BrandonSi Notebook Savant

fred2028 said: ↑

Just curious as to how the Core i7s handle their stuff. Take the 820QM for example. 4 physical + 4 virtual cores, so 8 threads total.

Not exactly right, but I see where you're going.

You have 4 physical cores, capable of handling two threads each. The difference is when you refer to "virtual cores".. Using that concept, you could have 4 virtual cores and 4 physical cores maxed out at 100% cpu utilization. That's not how it works.. What you actually have is 4 cores, handling 8 threads.. If the 4 cores are running at 100% cpu utilization, you have 8 threads, and each thread using 50% (or 60/40, 70/30, etc..) of the cpu time.

Similar ideas, but very different implementation.

Also, the application has the final say over how the threads are utilized, not the OS. That's why a single threaded app can make your Windows super-slow and almost unresponsive at 100% utilization. If it was up to the OS, that wouldn't happen, but the OS handles the requests of the app, not vice-versa.

That's why you also read a lot about thread-optimization and "properly threaded" apps.

Edit - I should add that what I was referring to was default behavior. You can, of course, tell Windows to only use specific CPUs/cores for an application. You, as the user, also have final control over the threading, as you can turn HT off in the BIOS.
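
For the "only use specific CPUs/cores" case: an affinity setting is typically expressed as a bitmask with one bit per logical processor. The sketch below builds a mask that picks one logical processor per physical core. It assumes the two HT siblings of a core are numbered next to each other (0 and 1 on the first core, 2 and 3 on the second, and so on), which is an assumption about the layout and not something guaranteed on every system. The function name is hypothetical.

```python
def one_thread_per_core_mask(physical_cores: int, threads_per_core: int = 2) -> int:
    """Build an affinity bitmask selecting the first logical CPU of each core.

    Assumes sibling logical processors are numbered adjacently, so core N
    owns logical CPUs N * threads_per_core .. N * threads_per_core + (threads_per_core - 1).
    """
    mask = 0
    for core in range(physical_cores):
        first_logical = core * threads_per_core
        mask |= 1 << first_logical  # set the bit for that logical CPU
    return mask


mask = one_thread_per_core_mask(physical_cores=4)
print(f"{mask:#010b}")  # 0b01010101 for 4 cores with 2 threads each
```

Task Manager's "Set Affinity" dialog exposes the same choice as a row of checkboxes, one per logical processor.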

BrandonSi, Dec 15, 2009

#7

Judicator Judged and found wanting.

Well, the point is that to the OS, it "looks" like 8 cores. It wasn't until Windows 7 that Windows could even tell the difference between a physical and logical core. In terms of actual engineering, yes, you're right in that it's really only 4 physical cores that can handle up to 2 threads (relatively) simultaneously. We're not quite at the point where we can create processors out of thin air yet (although I'm sure _someone's_ trying!).

And I would clarify that it's not so much that the application has the final say over how the threads are utilized, so much as the application defines what's in a thread, and how many threads are presented to the OS. That single threaded app that makes your Windows super-slow and almost unresponsive is an example of an application that presents the OS with only one thread stuffed full of everything that needs to be done, as opposed to a more intelligently programmed application (in this era of multi-core processors) that splits up that thread into multiple threads to share the load.

Judicator, Dec 16, 2009

#8

BrandonSi Notebook Savant

Judicator said: ↑

it's really only 4 physical cores that can handle up to 2 threads (relatively) simultaneously.

Nice catch. You're correct, it's not actually simultaneous execution.

BrandonSi, Dec 16, 2009

#9

davepermen Notebook Nobel Laureate

actually, it is simultaneous execution.. if enough free processing units are available on the core.

davepermen, Dec 16, 2009

#10

BrandonSi Notebook Savant

davepermen said: ↑

actually, it is simultaneous execution.. if enough free processing units are available on the core.

That's not how I understand it, do you have a source for that? It has been my understanding since the P4 times that HT was actually temporal multi-threading, meaning the two threads share the single core's resources, and as such cannot execute instructions simultaneously. Granted, we're talking microseconds, but they must take turns, as there is only one CPU/core and associated resources for the two threads.

BrandonSi, Dec 16, 2009

#11

weinter /dev/null

BrandonSi said: ↑

That's not how I understand it, do you have a source for that? It has been my understanding since the P4 times that HT was actually temporal multi-threading, meaning the two threads share the single core's resources, and as such cannot execute instructions simultaneously. Granted, we're talking microseconds, but they must take turns, as there is only one CPU/core and associated resources for the two threads.

I remember I read somewhere that for HT they actually have 2 sets of registers to hold the states of 2 running threads, and 1 execution core that switches constantly between them.
Ahh, Wikipedia agrees with me:

"Hyper-threading works by duplicating certain sections of the processor—those that store the architectural state—but not duplicating the main execution resources."

HT doesn't do simultaneous execution since there is only 1 execution core; however, it switches very fast, and threads have wait states.

weinter, Dec 16, 2009

#12

BrandonSi Notebook Savant

Thanks weinter! That's what I thought.. I think for most purposes saying "simultaneous" is OK, but technically it's not, since there's a single point of execution. Good find!

BrandonSi, Dec 16, 2009

#13

Tinderbox (UK) BAKED BEAN KING

Anybody know how to test the i7 turbo mode? The 720 is supposed to be able to have 1 core at 2.8 GHz.

EDIT: I was just watching the core frequency with CPU-Z and saw core 0 go to just under 2.8 GHz for a split second.

Tinderbox (UK), Dec 16, 2009

#14

Serg Nowhere - Everywhere

I will try and explain what I know on this.

I have a 720QM and so far this is how it goes.

When stressing ONE core the load will jump between cores...it is a rather strange behaviour, but it works so no complaints.

The cores seem to have some kind of priority: core 0.1 (my naming for the first virtual core) seems to have priority over cores 1.0 and 1.1.

Just checking my task manager, the cores are:
Core 0 is 4%
Core 0.1 is 0%
Core 1 is 0%
Core 1.1 is 0%
Core 2 is 8%
Core 2.1 is 3%
Core 3 is 0%
Core 3.1 is 3%

Serg, Dec 16, 2009

#15

Generic User #2 Notebook Deity

BrandonSi said: ↑

That's not how I understand it, do you have a source for that? It has been my understanding since the P4 times that HT was actually temporal multi-threading, meaning the two threads share the single core's resources, and as such cannot execute instructions simultaneously. Granted, we're talking microseconds, but they must take turns, as there is only one CPU/core and associated resources for the two threads.

that's how NON-HT processors work.

the execution units are constantly swapping data in and out of the pipeline and the caches. this is context switching between threads.

with HT processors though, there is no need to swap out the caches; the 'new' info it needs is ALREADY in the second set of caches, meaning that memory latency is much shorter.

there's a lot more technical detail than that, but i'm pretty sure that's an acceptable working model.

Generic User #2, Dec 16, 2009

#16

notyou Notebook Deity

weinter said: ↑

I remember I read somewhere that for HT they actually have 2 sets of registers to hold the states of 2 running threads, and 1 execution core that switches constantly between them.
Ahh, Wikipedia agrees with me.
HT doesn't do simultaneous execution since there is only 1 execution core; however, it switches very fast, and threads have wait states.

I believe you're thinking of some type of multi-threading, which is having multiple threads and constantly switching between them, but not executing multiple threads at once.

BrandonSi said: ↑

Thanks weinter! That's what I thought.. I think for most purposes saying "simultaneous" is OK, but technically it's not, since there's a single point of execution. Good find!

Generic User #2 said: ↑

that's how NON-HT processors work.

the execution units are constantly swapping data in and out of the pipeline and the caches. this is context switching between threads.

with HT processors though, there is no need to swap out the caches; the 'new' info it needs is ALREADY in the second set of caches, meaning that memory latency is much shorter.

there's a lot more technical detail than that, but i'm pretty sure that's an acceptable working model.

Generic has it right. Simultaneous Multi-Threading means you can execute two threads at once on a single core. What everyone else is assuming is that the execution core is a single unit. Instead, it's actually made up of several functional units (add, subtract, multiply, divide, etc.), which allow it to execute multiple threads in parallel.

notyou, Dec 16, 2009

#17

Serg Nowhere - Everywhere

Well, the theory is that you have a two-lane road. The actual core can take both lanes of information, and that is why you see two cores. See it this way: each lane you have is a "core" (which in reality is a thread, but let's use the word "core" since marketing does). Each one of these "cores" has the ability to do one thing at a time, so in the case of the 720QM you get a total of eight "cores" working.

The logic behind it is that you use as few physical cores as possible while doing as many things as possible. The physical core has priority when doing a single task. When launching a second task, if the first one has some overhead, it will have priority and lower the other 3+3 unless they are needed. When you launch a third thread, the next physical core comes into play and the virtual core accompanying it gets ready for use, so the first 2 threads on core 0 and 0.1 get a speed cut since they have to share. And so on until you occupy all eight cores.

Now, enough theory. In real life it is somewhat different: when stressing one core, that single core is not the only one active. All 8 are active, but the other seven are in a very low power consumption mode. Why? My guess is that one core cannot handle running at the full 2.8GHz for too long, since when you check the processor with the Task Manager, TMonitor or CPU-Z you will see how the TB jumps around, and the processes go from 0 to 1 to 2 to 3 and their respective virtual cores. I have yet to fully test mine, but on regular tasks the HT has priority over the physical when one thread is already running on the physical core.

And that is why I see the loads jump all over the CPU...

Serg, Dec 17, 2009

#18

IntelUser Notebook Deity

Serg, the reason Windows jumps threads around is because way back in the single core days, it helped improve multi-threading performance.

BrandonSi said: ↑

That's not how I understand it, do you have a source for that? It has been my understanding since the P4 times that HT was actually temporal multi-threading

The Pentium 4 uses Simultaneous Multi-Threading. The only Intel processors that use another type of multi-threading are the dual-core Itaniums code-named Montecito and Montvale. THAT is called TMT, or temporal multi-threading.
(Actually, the more commonly used term for TMT is SoEMT.)

SMT exists explicitly to increase utilization of otherwise idle execution units. Therefore two threads can be processed simultaneously.

Tinderbox (UK) said: ↑

Anybody know how to test the i7 turbo mode? The 720 is supposed to be able to have 1 core at 2.8 GHz.

EDIT: I was just watching the core frequency with CPU-Z and saw core 0 go to just under 2.8 GHz for a split second.

There is an app from Intel that allows you to observe Turbo Mode. If you have Windows 7/Vista, download the gadget called Turbo Boost Monitor.

http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=18353&lang=eng

IntelUser, Dec 17, 2009

#19

Judicator Judged and found wanting.

IntelUser said: ↑

The Pentium 4 uses Simultaneous Multi-Threading. The only Intel processors that use another type of multi-threading are the dual-core Itaniums code-named Montecito and Montvale. THAT is called TMT, or temporal multi-threading.
(Actually, the more commonly used term for TMT is SoEMT.)

SMT exists explicitly to increase utilization of otherwise idle execution units. Therefore two threads can be processed simultaneously.

Well, you're both right, depending on from which position you're looking at it. From the OS side, as far as it's concerned, yes, both threads are processing simultaneously. When you get down to the actual processing core itself, it can't actually process both threads at the same time. It'll process one, but whenever it stalls or is idle in one thread, it'll process the other. The difference largely lies in where the thread "swapping", shall we say, is implemented. In a non hyperthreaded/SMT enabled processor this is handled outside the CPU, on the OS level. In an i7, it's handled down at the hardware level, which is why to the OS, it looks like 2 simultaneously processing cores. Basically, as IntelUser said, it's a way to improve the efficiency of the execution units to keep them "occupied" more often. Essentially, it's an efficiency solution that shows up best in heavily multi-threaded/multi-tasking situations, where you have several threads running essentially simultaneously. When you get down to the real nitty gritty, however, it's just a more efficient way to share resources, which is why you only get somewhere around a 10-15% advantage (Well, Intel claims 30%) from hyperthreading in most heavily multi-threaded situations (a CPU with hyperthreading enabled (in theory doubling the threads it can process) will operate approximately 30% faster than the same CPU running the same tasks with hyperthreading turned off).

Judicator, Dec 17, 2009

#20

Explosivpotato Notebook Consultant

Judicator said: ↑

Well, you're both right, depending on from which position you're looking at it. From the OS side, as far as it's concerned, yes, both threads are processing simultaneously. When you get down to the actual processing core itself, it can't actually process both threads at the same time. It'll process one, but whenever it stalls or is idle in one thread, it'll process the other. The difference largely lies in where the thread "swapping", shall we say, is implemented. In a non hyperthreaded/SMT enabled processor this is handled outside the CPU, on the OS level. In an i7, it's handled down at the hardware level, which is why to the OS, it looks like 2 simultaneously processing cores. Basically, as IntelUser said, it's a way to improve the efficiency of the execution units to keep them "occupied" more often. Essentially, it's an efficiency solution that shows up best in heavily multi-threaded/multi-tasking situations, where you have several threads running essentially simultaneously. When you get down to the real nitty gritty, however, it's just a more efficient way to share resources, which is why you only get somewhere around a 10-15% advantage (Well, Intel claims 30%) from hyperthreading in most heavily multi-threaded situations (a CPU with hyperthreading enabled (in theory doubling the threads it can process) will operate approximately 30% faster than the same CPU running the same tasks with hyperthreading turned off).

This is, for the most part, how I always understood HT. Jumping from one thread to another while the first is idle certainly improves multitasking.

I also understood (as some previous posters already stated), that it allowed simultaneous processing of 2 dissimilar threads on a single core, but only when one thread leaves an execution portion of the core idle that another thread could be using.

Explosivpotato, Dec 17, 2009

#21

davepermen Notebook Nobel Laureate

BrandonSi said: ↑

That's not how I understand it, do you have a source for that? It has been my understanding since the P4 times that HT was actually temporal multi-threading, meaning the two threads share the single core's resources, and as such cannot execute instructions simultaneously. Granted, we're talking microseconds, but they must take turns, as there is only one CPU/core and associated resources for the two threads.

if one hyperthread is doing floating-point computations, it will schedule its tasks to the floating-point unit. if the other hyperthread is doing integer computations, it sends those to the integer units.

the result is that two computations happen in parallel, for two independent threads.

as far as i know, it does NOT interleave if the workloads are completely independent and don't require some part of the cpu that only exists once.

that's how i learned hyperthreading: two (or more) schedulers scheduling jobs onto the single core, which still has the same number of computational units. so it can't do twice the integer, or sse, or floating-point work (which a dual-core could). BUT it can do two different things at the same time, putting the cpu to better use.

of course, the main feature of hyperthreading is to hide memory latencies. whenever one thread waits for some data, the other can use the cpu to its fullest.
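
The difference between "taking turns" and "sharing the units" can be shown with a toy cycle counter. This is only a sketch under heavy assumptions: a made-up core with exactly one floating-point unit and one integer unit, every instruction taking one cycle, and no memory stalls. It is not a model of a real Nehalem core, and the function names are invented for illustration.

```python
from collections import deque


def cycles_smt(thread_a, thread_b):
    """Cycles needed when both threads may issue in the same cycle,
    provided they want different execution units."""
    queues = [deque(thread_a), deque(thread_b)]
    cycles = 0
    while any(queues):
        busy_units = set()
        # Alternate which thread gets first pick so neither one starves.
        order = (0, 1) if cycles % 2 == 0 else (1, 0)
        for index in order:
            queue = queues[index]
            if queue and queue[0] not in busy_units:
                busy_units.add(queue.popleft())
        cycles += 1
    return cycles


def cycles_temporal(thread_a, thread_b):
    """Cycles needed when only one thread may issue per cycle."""
    return len(thread_a) + len(thread_b)


fp_work = ["fp"] * 4    # a thread that only needs the floating-point unit
int_work = ["int"] * 4  # a thread that only needs the integer unit

print(cycles_smt(fp_work, int_work))       # different units: the threads overlap
print(cycles_temporal(fp_work, int_work))  # strict turn-taking
print(cycles_smt(fp_work, fp_work))        # same unit: no gain over turn-taking
```

In this toy model the mixed fp/int pair finishes in half the cycles under SMT, while two threads fighting over the same unit gain nothing, which is the "it can't do twice the integer work" point.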

davepermen, Dec 17, 2009

#22

Serg Nowhere - Everywhere

Well i7 architecture is a share-the-hardware-between-two-cores thing if you want to see it that way.

This means the hardware is one, but the core can do two things.

Launch thread one. Core 0 processes thread one. Launch thread two. Core 1 could process it, but why bother, Core 0 can handle more. Thread 2 goes to core 0, and while thread one is not being processed and is in idle waiting for response from the peripherals or software, the core 0 can get to work on thread 2. This is interpreted as another core by the computer. But it is only one core. So thread 1 has a response, and now thread 2 needs one. So core 0 swaps between thread 1 and thread 2. Thread 1 is being processed while thread 2 is on stand by for response. And so on.

When you launch a thread 3, it cannot go into the stand by mode on core 0, since core 0 is already working on two threads. So thread 3 goes to the next core. Process repeats for thread 4.

If you launch a thread 5 and thread 1 is done, thread 5 will take thread 1's place and work there, so as not to bother cores 2 and 3, since they are not needed.

Serg, Dec 17, 2009

#23

BrandonSi Notebook Savant

Wow, this got lively! Good discussion everyone! I just want to put a couple of things out there (as I understand them).

1 - Windows is an SMT aware OS. That is why with no HT enabled, it can process parallel instructions (since we have multiple physical cores).

2 - Intel takes advantage of Windows being SMT aware for HT, and advertises HT as additional cores. This is why we see 8 CPUs for a quad-core with HT. The OS treats these 'virtual' cores as actual cores; it is on the hardware end where the actual 'division of labor' happens.

3 - HT as it applies to a single core, is TMT. A single core cannot process two instructions at once (unless you're using specific cases, like Judicator mentioned, with FPU / integer operations). The processor will switch between threads to execute instructions, but most of the time, it's not actually simultaneous.

4 - If we have two threads (thread 1 and thread 2) assigned to core 0, and core 0 is busy executing thread 1, thread 2 (assuming the application is optimally coded) can be executed by another available (non-busy) core. In that sense, HT can be parallel. However, as it might apply to the P4 (or any single core solution), this isn't possible, as there's only one core.

I think that sums up the main points everyone was trying to make.. agree? Did I miss something, or incorrectly word anything?

BrandonSi, Dec 17, 2009

#24

Judicator Judged and found wanting.

3) I wasn't the one that mentioned specific cases (it was more notyou and davepermen), but at that point I think a lot depends on exactly how many functional processing units are in each "processor core", and what the thread that's attempting to run currently actually needs out of those resources. That, of course, is highly dependent on the actual architecture of said core as well as the thread itself.

4) I think that decision is up to the OS, not the processor. The OS assigns threads to cores, and the cores then run said threads. This was a big part of why Windows 7 was supposed to be so important for i7 with its SMT parking; it'll deliberately try to assign tasks to separate physical cores before using hyperthreading to put 2 threads onto one core. Vista and XP, from what I can tell, can't tell the difference between a physical and logical core, and thus are just as likely to put the 2 most taxing threads on a single physical core, and thus slow things down overall. Note that Windows 7 server apparently goes the other way, and thanks to core parking, will schedule things on as few (physical) cores as possible, to save power.
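
The two placement behaviours described in point 4 can be sketched as a pair of policies. This is a hypothetical illustration of the idea only, not how the Windows scheduler is actually implemented; `place_threads` and the policy names "spread" and "pack" are made up here.

```python
def place_threads(n_threads, n_cores=4, threads_per_core=2, policy="spread"):
    """Return a list of (physical_core, sibling) slots for n_threads.

    "spread": fill one logical processor on every physical core first,
              and only then start doubling up (the desktop behaviour
              described above).
    "pack":   fill both siblings of a core before waking the next core
              (the power-saving server behaviour described above).
    """
    slots = [(core, sibling)
             for core in range(n_cores)
             for sibling in range(threads_per_core)]
    if policy == "spread":
        # All first siblings come before any second sibling.
        slots.sort(key=lambda slot: (slot[1], slot[0]))
    elif policy != "pack":
        raise ValueError(f"unknown policy: {policy}")
    # "pack" keeps the core-by-core order the slots were generated in.
    return slots[:n_threads]


# The dual-threaded app from the first post, under each policy:
print(place_threads(2, policy="spread"))  # two different physical cores
print(place_threads(2, policy="pack"))    # both siblings of core 0
```

Under "spread" the dual-threaded app gets 2 physical cores and 0 virtual ones; under "pack" it gets 1 physical + 1 virtual, which are the two options the thread started with.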

Judicator, Dec 17, 2009

#25

BrandonSi Notebook Savant

Interesting, thanks Judicator. Apologies if I mixed up who said what.

Just to clarify, "Windows 7 Server" = Server 2008, correct?

BrandonSi, Dec 17, 2009

#26

f4ding Laptop Owner

davepermen got it right. Each physical core can handle two threads AT ONCE. Each physical core has smaller units, like a floating point unit (FPU), an integer unit (IU), and so on. If thread 1 requires the FPU only and not the IU, so that the core 1 IU is idle, then core 1 can handle another thread 2 IF thread 2 only requires the IU but not the FPU. That's the whole concept of SMT, and the reason why some programs (usually scientific programs) will see an advantage from an SMT-capable CPU while some don't.

I saw a paper somewhere by Intel or a graduate student that analyzed the new SMT in the i7 and how it is now more efficient than the old P4 HT. And in fact, Intel is not the first one to do SMT. IBM and SUN are already talking about 8 logical cores per CPU (or is it 80?), although theirs are different architectures altogether.

f4ding, Dec 17, 2009

#27

f4ding Laptop Owner

BrandonSi said: ↑

Interesting, thanks Judicator. Apologies if I mixed up who said what.

Just to clarify, "Windows 7 Server" = Server 2008, correct?

No, the server version of Windows 7 is Windows Server 2008 R2, which I am using.

f4ding, Dec 17, 2009

#28

f4ding Laptop Owner

Judicator said: ↑

4) I think that decision is up to the OS, not the processor. The OS assigns threads to cores, and the cores then run said threads. This was a big part of why Windows 7 was supposed to be so important for i7 with its SMT parking; it'll deliberately try to assign tasks to separate physical cores before using hyperthreading to put 2 threads onto one core. Vista and XP, from what I can tell, can't tell the difference between a physical and logical core, and thus are just as likely to put the 2 most taxing threads on a single physical core, and thus slow things down overall. Note that Windows 7 server apparently goes the other way, and thanks to core parking, will schedule things on as few (physical) cores as possible, to save power.

The decision is not up to the OS. The logical cores are transparent to the OS. The OS only needs to realize that the CPU is SMT capable, but it does not decide which thread is assigned to which CPU. The OS scheduler might need some modification so that it will give priority to physical cores over logical cores.

f4ding, Dec 17, 2009

#29

davepermen Notebook Nobel Laureate

no. the os knows which physical and virtual core it assigns the thread to, and prioritizes according to workload.

davepermen, Dec 17, 2009

#30

Judicator Judged and found wanting.

f4ding said: ↑

The decision is not up to the OS. The logical cores are transparent to the OS. The OS only need to realize that the CPU is SMT capable, but it does not decide which thread is assigned to which CPU. The OS scheduler might need some modification so that it will give priority to physical cores over logical cores.

That seems contradictory? It sounds like you're saying the OS scheduler is what assigns threads to cores, since you're saying that it needs to be modified to give priority to physical cores over logical cores, but then isn't the OS scheduler part of the OS? And if the logical cores are transparent to the OS and thus presumably the OS scheduler, then how could the OS scheduler give priority to physical cores over logical cores it couldn't even see?

Judicator, Dec 17, 2009

#31

f4ding Laptop Owner

Judicator said: ↑

That seems contradictory? It sounds like you're saying the OS scheduler is what assigns threads to cores, since you're saying that it needs to be modified to give priority to physical cores over logical cores, but then isn't the OS scheduler part of the OS? And if the logical cores are transparent to the OS and thus presumably the OS scheduler, then how could the OS scheduler give priority to physical cores over logical cores it couldn't even see?

The old OS scheduler could not do this. It simply assigned threads to all cores, physical or logical. This affects performance. That's why you read news about Intel working with Microsoft to help improve SMT-capable CPU performance. The new OS scheduler realizes and can differentiate between physical and logical cores (or at least it should, that's the plan or the logical progression). But in the end, which cores (logical or physical) each thread is going to does not matter to the OS. The OS only passes the thread to the CPU as far as the OS is concerned.

f4ding, Dec 17, 2009

#32

Serg Nowhere - Everywhere

AFAIK the OS assigns the threads to the cores.

A little off topic: I ran a little test using CATIA V5 R19 on my 720QM. The 4 physical cores shared a small load between them, and the 4 virtual cores saw little to no action, or a very small amount compared to their physical brothers.

Same with testing Microsoft ISE. I have yet to install CS4 on my laptop and test it.

Serg, Dec 17, 2009

#33

Judicator Judged and found wanting.

f4ding said: ↑

The old OS scheduler could not do this. It simply assigned threads to all cores, physical or logical. This affects performance. That's why you read news about Intel working with Microsoft to help improve SMT-capable CPU performance. The new OS scheduler realizes and can differentiate between physical and logical cores (or at least it should, that's the plan or the logical progression). But in the end, which cores (logical or physical) each thread is going to does not matter to the OS. The OS only passes the thread to the CPU as far as the OS is concerned.

That's the whole thing. If the OS doesn't care which core the thread is going to, there's no reason to modify the OS scheduler (part of the OS!) to differentiate between physical and logical cores. The fact that the OS scheduler was modified to differentiate between physical and logical cores implies to me that that is what assigns threads to cores (otherwise there'd be no reason for it to know the difference), and thus the decision of which core gets which thread falls back into the province of the OS. Not to mention, of course, the option of setting core affinity, which also reinforces the idea that thread/core selection is the province of the OS.

Judicator, Dec 17, 2009

#34