Date: Tue, 08 May 2007 08:55:08 +0200 From: Terje Mathisen <terje.mathisen@hda.hydro.com> Newsgroups: comp.arch Subject: Re: ISA quickie survey: min/max instructions? Message-ID: <1gj5h4-9mg.ln1@osl016lin.hda.hydro.com> Andrew Reilly wrote: > Hi all, > > I was just wondering about the "conditional-ish" arithmetic operations: > abs, min, max, and maybe others. Most of these are available in SIMD > instruction sets, but I haven't tracked down many of them in scalar > instruction sets. They could be handy for avoiding branches. Is > subtract+conditional move the accepted best way to handle these? Might > min and max show up (both for integer and floating point) in future scalar > instruction sets? In particular, min and max would seem to fit the > two-source-one-destination pattern, where conditional move doesn't, quite. Having max/min integer opcodes will only help with that particular idiom, while CMOVcc can be used in a lot of different situations. On a modern, relatively deep, pipeline with good branch prediction, the fastest way to handle min()/max() can often turn out to be the naive CMP/JA/MOV sequence, as long as the branch pattern predicts well: ; Return MAX(eax, ebx) cmp eax,ebx ja done mov eax,ebx done: If EAX is the largest value and the branch is correctly predicted, then the code above will have a latency of a single cycle! With EBX the larger and correct prediction it takes two cycles. In both cases a branch mispredict causes a 5-20 cycle hit. The conditional version is cmp eax,ebx cmovb eax,ebx This has a fixed latency of three cycles, so if the branch predictors above hits 90+% or better, the average cost of the if () {} version might actually be less than the 3-cycle CMOV-based code. Two caveats: a) Sometimes the worst case time is just as important as the average or best case, then you should definitely use CMOV. This is also important for crypto code where you definitely want the execution time to be constant/independent of the key. b) The branch predictor is a very limited resource (i.e. 128 entries on a brand-new Core 2 Duo), so using it for a lot of min/max operations might blow away some other more useful entry. Terje -- - <Terje.Mathisen@hda.hydro.com> "almost all programming can be viewed as an exercise in caching"

Index Home About Blog