# PureBasic Forum

 It is currently Tue Jan 19, 2021 6:49 pm

 All times are UTC + 1 hour

 Page 1 of 1 [ 14 posts ]
 Print view Previous topic | Next topic
Author Message
 Post subject: AVX - VMOVUPS returns 0Posted: Sun Jun 28, 2015 9:52 pm
 Enthusiast

Joined: Tue May 28, 2013 10:51 pm
Posts: 540
Location: Europe
Greetings to all,

here's a small code snippet:

Code:
EnableExplicit
Define.i i
Dim FloatArray.f(7) : Dim FloatArray2.f(7)
For i = 0 To 7
FloatArray(i) = i + 1
FloatArray2(i) = i + 2
Next
Define *ptr, *ptr2, *ptr_result = AllocateMemory(8*SizeOf(Float))
*ptr = @FloatArray()
*ptr2 = @FloatArray2()
!vmovups ymm0, yword[p_ptr]
!vmovups ymm1, yword[p_ptr2]
!vmulps ymm0, ymm0, ymm1
!vmovups yword[p_ptr_result], ymm0

Basically, I'd like to multiply two float arrays, member per member, which VMULPS should do.

However, last line simply sets *ptr_result to 0, so I'm obviously doing something wrong. Can anyone assist?

TIA!

_________________
"If you lie to the compiler, it will get its revenge."
Henry Spencer
https://www.pci-z.com/

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Mon Jun 29, 2015 7:24 am
 Enthusiast

Joined: Wed Apr 12, 2006 7:59 pm
Posts: 174
Location: Germany
You overwrite *ptr_result with the result (check your code!). You must use register for addressing:
Code:
EnableExplicit
Define.i i
Dim FloatArray.f(7) : Dim FloatArray2.f(7)
For i = 0 To 7
FloatArray(i) = i + 1
FloatArray2(i) = i + 2
Next
Define *ptr, *ptr2, *ptr_result = AllocateMemory(8*SizeOf(Float))
*ptr = @FloatArray()
*ptr2 = @FloatArray2()
!vmovups ymm0, yword[p_ptr]
!vmovups ymm1, yword[p_ptr2]
!vmulps ymm0, ymm0, ymm1
;!vmovups yword[p_ptr_result], ymm0
!mov rax,[p_ptr_result]
!vmovups yword[rax], ymm0

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Mon Jun 29, 2015 7:32 am
 Enthusiast

Joined: Tue May 28, 2013 10:51 pm
Posts: 540
Location: Europe
Helle wrote:
You overwrite *ptr_result with the result (check your code!). You must use register for addressing:
Hi Helle,

sorry for asking stupid questions, I'm still grasping ASM a bit; !mov rax,[p_ptr_result] loads pointer to RAX and then you use VMOVUPS to load result from YMM0 to RAX again? Why exactly?

In the end, Debug PeekF(*ptr_result) produces 0, while it should return "2.0".

_________________
"If you lie to the compiler, it will get its revenge."
Henry Spencer
https://www.pci-z.com/

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Mon Jun 29, 2015 7:59 pm
 Enthusiast

Joined: Wed Apr 12, 2006 7:59 pm
Posts: 174
Location: Germany
vmovups [rax],ymm0 load the 32 Bytes from ymm0 into the memory-address with the value from rax, not in rax (is a 8-Byte-Register!).
Check:
Code:
Dim FloatArray1.f(7) : Dim FloatArray2.f(7)
Define.f j = 0.0, k = 0.0
Define.i i, ptr1 = @FloatArray1(), ptr2 = @FloatArray2(), result = AllocateMemory(8 * SizeOf(Float))

For i = 0 To 7
j + 1.0
FloatArray1(i) = j
k + 2.0
FloatArray2(i) = k
Next

!mov rdx,[v_ptr1]            ;for 64-Bit (rdx)
!vmovups ymm0,[rdx]
!mov rdx,[v_ptr2]
!vmovups ymm1,[rdx]
!vmulps ymm0,ymm0, ymm1
!mov rax,[v_result]
!vmovups [rax],ymm0

For i = 0 To 7
Debug PeekF(result + i * SizeOf(Float))
Next

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Tue Jun 30, 2015 8:14 pm
 Enthusiast

Joined: Tue May 28, 2013 10:51 pm
Posts: 540
Location: Europe
Helle wrote:
vmovups [rax],ymm0 load the 32 Bytes from ymm0 into the memory-address with the value from rax, not in rax (is a 8-Byte-Register!).
Thanks, I've started grasping it slowly.

Perhaps you could shed some light on the following, I hope my question will make sense. Is there explicit way to operate on YMMx register on particular core, or this is something OS should handle? In other words, can this ASM code be run in 4 threads on 4 core CPU and expect linear increase of performance?

With my best,

Bruno

_________________
"If you lie to the compiler, it will get its revenge."
Henry Spencer
https://www.pci-z.com/

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Wed Jul 01, 2015 8:34 am
 Enthusiast

Joined: Wed Apr 12, 2006 7:59 pm
Posts: 174
Location: Germany
For this short code is multi-threading nonsense. But take a look at this (an old test from me, not perfect ):
Code:
;- Apfelmännchen-Test mit AVX 256-Bit, Intel-CPU
;- "Helle" Klaus Helbing, 30.03.2013, PB v.5.10 (x64)

Global Freq.q                          ;für Zeitmessung
Global Start.q
Global Ende.q
Global Zeit.d
Global ZeitG.d

Global Zoom.d = 1.0
Global AchseX.d = 1.25
Global AchseY.d = -2.0
Global ScreenX.q = 1024
Global ScreenY.q = 1023                ;wegen Y-Koordinate = 0, Optik für Grundbild
Global Breite.d = 2.5
Global Hoehe.d = 2.5
Global XScale.d = Breite / ScreenX
Global YScale.d = -Hoehe / ScreenY
Global X1.d = XScale / Zoom
Global Y1.d = YScale / Zoom
Global MX.d
Global MY.d
Global VersatzX.d = 0.5                ;Versatz für Zentrierung
Global VersatzY.d = 0.0
Global Vier.d = 4.0

Global MaxIter.q = 255
Global Zoom_Max.q = 500000000000

Global IterCount.q
Global IterColor1.q
Global IterColor2.q
Global IterColor3.q
Global IterColor4.q

Global XStart.d = -1.4563337
Global YStart.d =  0.0

Global RohBuffer.q
Global ZeichenBuffer.q
Global ColorBuffer.q
Global X.d
Global XH1.d = 0.0
Global XH2.d = 256.0
Global XH3.d = 512.0
Global XH4.d = 768.0

Global Z1.d
Global Z2.d

Global D3.d = 3.0
Global D17.d = 17.0
Global D21.d = 21.0
Global D53.d = 53.0
Global D127.d = 127.0
Global D128.d = 128.0

#ScreenHeight = 1024.0
#ScreenWidth = 1024.0

!jmp OverASM
!SinAVX:
;this part is for reduce to 0-Pi/2 and set signum
!vmovsd xmm2,qword[Pi_Half]          ;load XMM2 with Pi_Half (1.570796326794896619)
!vdivsd xmm0,xmm0,xmm2               ;divide XMM0 (x=radiant) by XMM2 (Pi_Half), result in XMM0
!vcvttsd2si rax,xmm0                 ;convert the result (float double precision) to integer with truncation (like Int(x) in PB)
!vcvtsi2sd xmm1,xmm1,rax             ;convert the integer in RAX to float double precision in XMM1
!vsubsd xmm0,xmm0,xmm1               ;subtract XMM1 from XMM0, result in XMM0
!vmulsd xmm0,xmm0,xmm2               ;multiply XMM0 with XMM2 (Pi_Half), result in XMM0
!test rax,1                          ;test for quadrant 2 and 4 (6,8...)
!jz @f                               ;no
!@@:
!test rax,2                          ;test for quadrant 3 and 4 (and all negatives)
!jz @f                               ;no
!vxorpd xmm0,xmm0,[Minus]            ;change (set) bit 63 (signum)
!@@:
;calculate the first 4 terms (without start-value x), 1=0-63, 2=64-127, 3=128-191, 4=192-255
!vmovddup xmm0,xmm0                  ;XMM0: 1=x^1 2=x^1  duplicate bits 0-63 in bits 64-127
!vinsertf128 ymm1,ymm0,xmm0,1b       ;YMM1: 1=x^1 2=x^1 3=x^1 4=x^1  YMM1(0-127)=XMM0, YMM1(128-255)=XMM0
!vmulpd ymm2,ymm1,ymm1               ;YMM2: 1=x^2 2=x^2 3=x^2 4=x^2
!vmulsd xmm3,xmm2,xmm2               ;YMM3: 1=x^4 2=x^2 3=x^0 4=x^0
!vmulpd ymm4,ymm2,ymm2               ;YMM4: 1=x^4 2=x^4 3=x^4 4=x^4
!vmulpd ymm2,ymm4,ymm4               ;YMM2: 1=x^8 2=x^8 3=x^8 4=x^8  for the next 4 terms
!vmulpd ymm3,ymm3,ymm1               ;YMM3: 1=x^5 2=x^3 3=x^0 4=x^0
!vmulpd ymm1,ymm3,ymm4               ;YMM1: 1=x^9 2=x^7 3=x^0 4=x^0
!vperm2f128 ymm5,ymm1,ymm3,100000b   ;YMM5: 1=x^9 2=x^7 3=x^5 4=x^3  YMM5(0-127)=YMM1(0-127), YMM5(128-255)=YMM3(0-127)

!vmulpd ymm1,ymm5,yword[RezFak]      ;multiply the 4 values in YMM5 with RezFak -1/3! ... 1/9!, result in YMM1
;next 4 terms, without loop
!vmulpd ymm5,ymm5,ymm2               ;YMM5: 1=x^17 2=x^15 3=x^13 4=x^11
!vmulpd ymm3,ymm5,yword[RezFak+32]
;next 4 terms
!vmulpd ymm5,ymm5,ymm2               ;YMM5: 1=x^25 2=x^23 3=x^21 4=x^19
!vmulpd ymm3,ymm5,yword[RezFak+64]

!vhaddpd ymm2,ymm1,ymm1              ;YMM2: 1=1+2 of YMM1, 3=3+4 of YMM1
!vextractf128 xmm1,ymm2,1b           ;XMM1: 3+4 of YMM2
!vaddsd xmm3,xmm2,xmm1               ;XMM3: 1=sum of iterations

!vaddsd xmm0,xmm0,xmm3               ;XMM0: 1=sum of iterations plus start-value (1.term=x)

!vzeroupper                          ;set YMM0H-YMM15H to zero

!ret

!Minus:
!dq 8000000000000000h           ;for change (set) bit 63 (signum)
!Pi_Half:
!dq  1.570796326794896619
!RezFak:
!dq  2.755731922398589065e-6    ; 1/9!   4.Iteration
!dq -1.984126984126984127e-4    ;-1/7!   3.Iteration
!dq  8.333333333333333333e-3    ; 1/5!   2.Iteration
!dq -1.666666666666666667e-1    ;-1/3!   1.Iteration

!dq  2.811457254345520763e-15   ; 1/17!  8.Iteration
!dq -7.647163731819816476e-13   ;-1/15!  7.Iteration
!dq  1.605904383682161460e-10   ; 1/13!  6.Iteration
!dq -2.505210838544171878e-8    ;-1/11!  5.Iteration

!dq  6.446950284384473396e-26   ; 1/25!  12.Iteration
!dq -3.868170170630684038e-23   ;-1/23!  11.Iteration
!dq  1.957294106339126123e-20   ; 1/21!  10.Iteration
!dq -8.220635246624329717e-18   ;-1/19!  9.Iteration

Procedure Colors()
ColorBuffer = AllocateMemory(4 * MaxIter) ;2
!MOV rdx,[v_ColorBuffer]
!MOV rcx,[v_MaxIter]
!XOR r8,r8
!XOR r15,r15                         ;sicher ist sicher
!CVTSI2SD xmm6,[v_MaxIter]
!@@:
!CVTSI2SD xmm14,r8                   ;convert the integer in R8 to float double precision in XMM7
!DIVSD xmm14,xmm6
;- R
!MOVSD xmm0,xmm14
!MULSD xmm0,[v_D21]
!call SinAVX
!MULSD xmm0,[v_D127]
!CVTtSD2SI r15,xmm0                  ;convert the result (float double precision) to integer with truncation (like Int(x) in PB)
!SHL r15,8
;- G
!movsd xmm0,xmm14
!mulsd xmm0,[v_D17]
!call SinAVX
!mulsd xmm0,[v_D127]
!cvttsd2si rax,xmm0                  ;convert the result (float double precision) to integer with truncation (like Int(x) in PB)
!mov r15b,al
!shl r15,8
;- B
!movsd xmm0,xmm14
!mulsd xmm0,[v_D53]
!call SinAVX
!mulsd xmm0,[v_D127]
!cvttsd2si rax,xmm0                  ;convert the result (float double precision) to integer with truncation (like Int(x) in PB)
!mov r15b,al
;- Farbwert (32-Bit) für 1 Pixel in Grafikbuffer schreiben
!MOV [rdx],r15d                      ;0-7:B, 8-15:G, 16-23:R
!SUB rcx,1
!JNZ @b

EndProcedure

;Iterations-Schleife
;For Y = 0 To ScreenY - 1
!XOR r13,r13                         ;Zähler Iterationen
!MOV rcx,qword[v_RohBuffer]          ;Pointer in RohBuffer
!MOV r8,Start_Y                      ;Zähler äussere Schleife (Y-Koordinate)
!MULSD xmm0,qword[v_Y1]
!MOVQ qword[v_X],xmm0

!.Y_Loop:                              ;local label
;For X = 0 To ScreenX - 1
!XOR r9,r9                           ;Zähler innere Schleife (X-Koordinate)

!VMOVDDUP xmm1,qword[v_AchseY]       ;cX in ymm1=0-0-1-1

!VPERM2F128 ymm3,ymm1,ymm1,0         ;untere 128 Bit nach oben kopieren, cX in ymm3=1-2-1-2
!VSHUFPD ymm1,ymm3,ymm3,00001000b    ;beide unteren Doubles gleich, cX in ymm1=1-2-3-3

!.X_Loop:
!VMOVAPD ymm3,ymm2                 ;ymm3=zy*zy
!VMOVAPD ymm4,ymm1                 ;ymm4=zx*zx
!XOR rax,rax                       ;IterCounter=0
!MOV r10,00001111b                 ;Bit 0 bis 3 als Merker fertig
!XOR r11,r11                       ;enthält die vier Iterationstiefen (4 Words)
!MOV qword[r12],0

!.I_Loop:
;While (zx * zx + zy * zy) < 4 And (IterCounter < MaxIter)
;zx * zx
!VMULPD ymm5,ymm4,ymm4             ;ymm5=zx*zx
;zy * zy
!VMULPD ymm6,ymm3,ymm3             ;ymm6=zy*zy
;(zx * zx + zy * zy) < 4 ?
!VCMPNLTPD ymm0,ymm0,ymm14         ;ymm14 < ymm0 ?  ymm14=4*4.0
!VMOVMSKPD edx,ymm0                ;Bit1=Pixel  Bit0=Pixel+1
!AND edx,r10d
!JZ .AlleK                         ;alle kleiner
!CMP edx,00001111b
!JNE @f
!VMOVQ xmm0,rax
!VPSHUFLW xmm0,xmm0,0
!VMOVQ r11,xmm0                    ;R11 jetzt 4x Iterationstiefe (jeweils als Word)

!JMP .Farbgebung                   ;alle drüber
!@@:
!TEST edx,00000001b
!JZ @f
!MOV [r12+6],ax
!@@:                                   ;ist Bit1 gesetzt?
!TEST edx,00000010b
!JZ @f
!MOV [r12+4],ax
!@@:                                   ;ist Bit2 gesetzt?
!TEST edx,00000100b
!JZ @f
!MOV [r12+2],ax
!@@:                                   ;ist Bit3 gesetzt?
!TEST edx,00001000b
!JZ .AlleK
!MOV [r12+0],ax
!.AlleK:
!MOV r11,[r12]

!NOT edx
!AND r10d,edx
!AND r10d,00001111b
!JZ .Farbgebung                    ;das war´s schon

;zy = 2 * zx * zy + cY
!VMULPD ymm3,ymm3,ymm4
;zx = zx * zx - zy * zy + cX
!VSUBPD ymm4,ymm5,ymm6
!INC rax                           ;IterCounter
!CMP rax,255;[v_MaxIter]
!JB .I_Loop
;Wend

!CMP r10,00001111b
!JE .AlleX
!TEST r10,00000001b
!JZ @f
!MOV [r12+6],ax
!@@:                                   ;ist Bit1 gesetzt?
!TEST r10,00000010b
!JZ @f
!MOV [r12+4],ax
!@@:                                   ;ist Bit2 gesetzt?
!TEST r10,00000100b
!JZ @f
!MOV [r12+2],ax
!@@:                                   ;ist Bit3 gesetzt?
!MOV r11,[r12]
!TEST r10,00001000b
!JZ .Farbgebung
!JMP .Farbgebung

!.AlleX:
!VMOVQ xmm0,rax
!VPSHUFLW xmm0,xmm0,0
!VMOVQ r11,xmm0                    ;R11 jetzt 4x jeweilige Iterationstiefe (jeweils als Word)

!.Farbgebung:
!MOV [rcx],r11
!ADD rcx,8                         ;4 Punkte a 2 Byte
;innere Schleife X
;Next
!CMP r9,1024                       ;ScreenX
!JB .X_Loop
;äussere Schleife Y
;Next
!INC r8
!CMP r8,End_Y                        ;ScreenY
!JB .Y_Loop

!VZEROUPPER                          ;High von YMMx auf Null
EndMacro

!LEA r12,[v_IterColor1]
!MOVQ xmm0,qword[v_XH1]              ;einzeln nicht nötig
Thread(0, 0, 256)                    ;0=Beginn des Viertels im Grafik-Buffer, 0=Start Y-Koordinate, 256=Ende Y-Koordinate. Parameter anpassen für Anzahl Threads! For 1 Core: Thread(0, 0, 1024)
EndProcedure

!LEA r12,[v_IterColor2]
!MOVQ xmm0,qword[v_XH2]
Thread(524288, 256, 512)             ;524288=1024*2*256=Beginn des Viertels im Grafik-Buffer, 256=Start Y-Koordinate, 512=Ende Y-Koordinate
EndProcedure

!LEA r12,[v_IterColor3]
!MOVQ xmm0,qword[v_XH3]
Thread(1048576, 512, 768)            ;1048576=1024*2*512=Beginn des Viertels im Grafik-Buffer, 512=Start Y-Koordinate, 768=Ende Y-Koordinate
EndProcedure

!LEA r12,[v_IterColor4]
!MOVQ xmm0,qword[v_XH4]
Thread(1572864, 768, 1024)           ;1572864=1024*2*768=Beginn des Viertels im Grafik-Buffer, 768=Start Y-Koordinate, 1024=Ende Y-Koordinate
EndProcedure

Procedure Frac()

TEXT2\$ = "Zoom = " + StrD(Zoom, 2)

X1 = XScale / Zoom
AchseY = (-0.5 / Zoom) + XStart ;- VersatzX
Y1 = YScale / Zoom
AchseX = (1.25 / Zoom) + YStart ;VersatzY

StartDrawing(ScreenOutput())
ZeichenBuffer = DrawingBuffer()

IterCount = 0

QueryPerformanceCounter_(@Start)

;Analog

QueryPerformanceCounter_(@Ende)
QueryPerformanceFrequency_(@Freq)
Zeit = (Ende - Start) / Freq
ZeitG + Zeit

!MOV rcx,1024*1024                   ;Anzahl Bildpunkte
!MOV rdx,[v_ZeichenBuffer]
!MOV r9,[v_RohBuffer]
!MOV r11,[v_ColorBuffer]
!@@:
;- Iterationstiefen für 4 Punkte auslesen
!MOV r10,[r9]
!MOV r13,4
!LL:
!MOV r8,r10
!SHR r10,16
!AND r8,0FFFFh                       ;jede Iterationstiefe ist ein Word
!SHL r8,2
!MOV r15d,[r11+r8]
;- Farbwert (32-Bit) für 1 Pixel in Grafikbuffer schreiben
!MOV [rdx],r15d                      ;0-7:B, 8-15:G, 16-23:R
!SUB r13,1
!JNZ LL
;- 4 Punkte auf einmal
!SUB rcx,4
!JNZ @b

StopDrawing()
FlipBuffers()

SetGadgetText(7, "Zeit / Frame = " + StrD(Zeit) + " s")
Z1 = IterCount / Zeit / 1000000
SetGadgetText(6, "Iterationen = " + Str(IterCount) + " (" + StrD(Z1, 1) + " Mio/s)")
Z2 = 1 / Zeit
SetGadgetText(8, "Frames = " + StrD(Z2, 1) + " /s")
EndProcedure

!OverASM:

If OpenWindow(0, 0, 0, 1300, 1024, "Apfelmännchen-Test-AVX-4D-4C", #PB_Window_MinimizeGadget | #PB_Window_MaximizeGadget) = 0 Or InitSprite() = 0    ;InitSprite() für OpenWindowedScreen()
MessageRequester("Fehler!", "Hier stimmt was nicht!")
End
EndIf

TextGadget(1, 1030, 20, 45, 20, "")
TextGadget(2, 1030, 50, 250, 20, "")
TextGadget(3, 1030, 80, 150, 20, "Max. Iterationstiefe = " + Str(MaxIter))
TextGadget(4, 1030, 110, 250, 20, "")
TextGadget(5, 1030, 140, 250, 20, "")
TextGadget(6, 1030, 170, 250, 20, "")
TextGadget(7, 1030, 200, 200, 20, "")
TextGadget(8, 1030, 230, 260, 20, "")
TextGadget(9, 1030, 260, 260, 20, "")

FontHigh = Int(9.0 / (GetDeviceCaps_(GetDC_(WindowID(0)), #LOGPIXELSY) / 96.0))     ;Font anpassen
For i = 1 To 9  ;8
Next

If OpenWindowedScreen(WindowID(0), 0, 0, #ScreenWidth, #ScreenHeight, 0, 0, 0);, #PB_Screen_NoSynchronization)
Colors()
RohBuffer = AllocateMemory(1024 * 1024 * 2)

While Zoom < Zoom_Max
WindowEvent()
Zoom = (Zoom * 1.04)
Frac()
Wend

While Zoom >= 1.00
WindowEvent()
Zoom = (Zoom / 1.04)
Frac()
Wend

SetGadgetText(9, "Total Time = " + StrD(ZeitG) + " s")
Repeat
Delay(1)
Until WindowEvent() = #PB_Event_CloseWindow
EndIf

For change the number of cores see the remarks in line 316 and 360-367.
Have fun !
Helle

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Wed Jul 01, 2015 8:48 am
 Enthusiast

Joined: Tue May 28, 2013 10:51 pm
Posts: 540
Location: Europe
Helle wrote:
For this short code is multi-threading nonsense. But take a look at this (an old test from me, not perfect ):

/snip

For change the number of cores see the remarks in line 316 and 360-367.
Have fun !
Dear Helle,

wow, looks great (though it crashes for some reason randomly), I have much to read now!

Thanks once again for your assistance!

Bruno

_________________
"If you lie to the compiler, it will get its revenge."
Henry Spencer
https://www.pci-z.com/

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Wed Jul 01, 2015 8:52 am
 Enthusiast

Joined: Tue May 26, 2009 2:11 pm
Posts: 672
Hmmm...
Code:
Procedure Colors()
ColorBuffer = AllocateMemory(4 * MaxIter) ;2   <----- This is Line 130
!MOV rdx,[v_ColorBuffer]
!MOV rcx,[v_MaxIter]
!XOR r8,r8
!XOR r15,r15                         ;sicher ist sicher
!CVTSI2SD xmm6,[v_MaxIter]
!@@:
...
...
EndProcedure

Debugger wrote:
[09:43:46] Warte auf den Start des Executable...
[09:43:46] Executable-Typ: Windows - x64 (64bit)
[09:43:46] Executable gestartet.
[09:43:46] [ERROR] Zeile: 130
[09:43:46] [ERROR] Illegale Anweisung. (Ausführen von binären Daten?)

What can be done to avoid this illegal instruction message?

_________________

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Wed Jul 01, 2015 9:07 am
 Enthusiast

Joined: Wed Apr 12, 2006 7:59 pm
Posts: 174
Location: Germany
"Zeile 130" is a dummy from PB/FAsm. Change the FAsm.exe (directory .../PureBasic/Compilers) with the newest version.

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Wed Jul 01, 2015 9:13 am
 Enthusiast

Joined: Tue May 26, 2009 2:11 pm
Posts: 672
Helle wrote:
"Zeile 130" is a dummy from PB/FAsm. Change the FAsm.exe (directory .../PureBasic/Compilers) with the newest version.
Shouldn't PB 5.31(x64) have the newest FAsm.exe?
My version shows: ‎14. ‎Mai ‎2012, ‏‎13:54:06

_________________

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Wed Jul 01, 2015 9:31 am
 Enthusiast

Joined: Wed Apr 12, 2006 7:59 pm
Posts: 174
Location: Germany
[url]
[/url]
Date: 23.3.2015

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Wed Jul 01, 2015 5:13 pm
 Enthusiast

Joined: Tue May 26, 2009 2:11 pm
Posts: 672
Hello Helle!
Helle wrote:
[url]
[/url]
Date: 23.3.2015

Looks like my CPU isn't supporting AVX according to your nice little program
I've found here: http://www.purebasic.fr/german/viewtopi ... 9&start=17

Is an i5 760 already that outdated?

_________________

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Wed Jul 01, 2015 6:41 pm
 Enthusiast

Joined: Wed Apr 12, 2006 7:59 pm
Posts: 174
Location: Germany
Look at http://ark.intel.com and then " Previous Generation Intel Core i5 Processor". No AVX, only SSE4.2. And its a long time to Christmas...

Top

 Post subject: Re: AVX - VMOVUPS returns 0Posted: Wed Jul 01, 2015 7:02 pm
 Enthusiast

Joined: Tue May 26, 2009 2:11 pm
Posts: 672
Helle wrote:
...
And its a long time to Christmas...

I don't think that I will reach that Christmas.

_________________

Top

 Display posts from previous: All posts1 day7 days2 weeks1 month3 months6 months1 year Sort by AuthorPost timeSubject AscendingDescending
 Page 1 of 1 [ 14 posts ]

 All times are UTC + 1 hour

#### Who is online

Users browsing this forum: No registered users and 2 guests

 You cannot post new topics in this forumYou cannot reply to topics in this forumYou cannot edit your posts in this forumYou cannot delete your posts in this forum

Search for:
 Jump to:  Select a forum ------------------ PureBasic    Coding Questions    Game Programming    3D Programming    Assembly Programming    The PureBasic Editor    The PureBasic Form Designer    General Discussion    Feature Requests and Wishlists    Tricks 'n' Tips Bug Reports    Bugs - Windows    Bugs - Linux    Bugs - Mac OSX    Bugs - IDE    Bugs - Documentation OS Specific    AmigaOS    Linux    Windows    Mac OSX Miscellaneous    Announcement    Off Topic Showcase    Applications - Feedback and Discussion    PureFORM & JaPBe    TailBite