Are there situations in which SMT negatively impacts performance? Suppose we have two software threads from two separate applications, each with a high ratio of memory loads/stores to arithmetic, and suppose the kernel schedules them onto the two hardware threads of the same physical core. Then we don't get much latency-hiding advantage (arithmetic operations are infrequent, so there is little useful work to overlap with the other thread's memory stalls), and in addition each thread fills the shared cache with its own data, leaving less cache room for the other application's data. Is the worst-case scenario for SMT something like this?