AVX2 disabled on Windows as workaround for gcc bug #295

Open
sbabic44 opened this issue Dec 5, 2020 · 19 comments
Labels
core (Core application logic), performance (Something is too slow), portability (Stuff that doesn't work on some OS)

Comments

@sbabic44

sbabic44 commented Dec 5, 2020

On a Keysight 3000T, the software throws a shader error and quits.
The PC runs Windows with an Intel 4600 HD, OpenGL 4.3, and the latest driver.
The errors are in the attachments.

Error 450.txt

I also tried changing the GLSL version from 450 to 430:

Error 430.txt

@azonenberg
Collaborator

Looks like the 4600 HD doesn't support 64-bit integer arithmetic (GL_ARB_gpu_shader_int64). Do you have a machine with a discrete graphics card you can try?

Intel integrated cards have been lagging in support for a while, but even fairly old NVidia/AMD discrete GPUs should work.

@sbabic44
Author

sbabic44 commented Dec 5, 2020

Yes, I'm afraid I will need to upgrade.
I will try it on my laptop, which has an Nvidia GPU.
And that was a freaky fast answer, respect!
Anyway, I'll try it and report back.
Regards,

@sbabic44
Author

sbabic44 commented Dec 5, 2020

Yes, that was it. I managed to start it, but now I get another error:
GetConsoleScreenBufferInfo() failed (6);
This happens when I try to start a histogram, an FFT, or any other graphical analysis.
Statistics and text measurements work fine.

@azonenberg
Collaborator

This is on Windows, I assume? Can you post a screenshot of the error showing both the glscopeclient window and the console?

@bvernoux
Contributor

bvernoux commented Dec 6, 2020

I confirm that glscopeclient also crashes with FFT on my side with the Windows build. It seems related to alignment, but that is still to be confirmed.
https://github.com/anthonix/ffts is used in glscopeclient. It has a number of known issues (https://github.com/anthonix/ffts/issues) and seems to have been dead (not maintained) for a few years; the latest commit was on Jun 17, 2017.
So far I have been searching for an alternative, but it is not easy to find something good, fast, and multi-platform with an open license like MIT or BSD.
For information, the more recent fork https://github.com/linkotec/ffts does not build correctly with MSYS2 mingw64.

@sbabic44
Author

sbabic44 commented Dec 6, 2020

Yes it crashes with Segmentation fault:

sinisa@t440p MINGW64 ~
$ glscopeclient scoop:agilent:lan:3104t.:5025
GetConsoleScreenBufferInfo() failed (6);
Warning: glscopeclient works best with the OMP_WAIT_POLICY environment variable
set to PASSIVE
Segmentation fault

sinisa@t440p MINGW64 ~

Main screen closes, nothing to see there.

I tried FFT, histograms, and a few more, and it crashes every time.

@bvernoux
Contributor

bvernoux commented Dec 6, 2020

This issue should probably be renamed, as it is an issue with FFTS on Windows with MSYS2 mingw64. It does not seem to affect GNU/Linux, though that is still to be confirmed.
I have fixed it: the problem was in computing the next highest power of 2, see https://github.com/azonenberg/scopehal/pulls
All crashes related to ffts are now fixed. They affected the following scopehal classes:

  • JitterSpectrumFilter
  • FFTFilter
  • DeEmbedFilter
  • TestWaveformSource (used in demo mode)

@sbabic44 sbabic44 changed the title from "Shader error, software aborts Keysight 3000T" to "Issue with FFTS for Windows with MSYS2 mingw64 (Keysight 3000T)" on Dec 7, 2020
bvernoux added a commit to bvernoux/scopehal that referenced this issue Dec 7, 2020
Compute next highest power of 2 to fix issues when using code (in different parts):
  const size_t npoints = pow(2, ceil(log2(depth))); (see scopehal\TestWaveformSource.cpp => TestWaveformSource::DegradeSerialData()...)
The original code (pow(2, ceil(log2())) has some border effects when built with MSYS2 mingw64 "Release" mode as it does not compute correctly the highest power of 2 for example with parameter 100000 it was returning 131071 (instead of 131072) which crashed ffts.
This implementation fix issue ngscopeclient/scopehal-apps#295
The function next_pow2() shall replace pow(2, ceil(log2())) in following code:
scopehal\TestWaveformSource.cpp
	Line 282: 	//const size_t npoints = pow(2, ceil(log2(depth)));
scopeprotocols\DeEmbedFilter.cpp
	Line 230: 	const size_t npoints = pow(2, ceil(log2(npoints_raw)));
scopeprotocols\FFTFilter.cpp
	Line 179: 	const size_t npoints = pow(2, ceil(log2(npoints_raw)));
scopeprotocols\JitterSpectrumFilter.cpp
	Line 195: 	const size_t npoints = pow(2, ceil(log2(npoints_raw)));
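For illustration, here is a minimal integer-only sketch of what such a helper could look like. The name next_pow2 comes from the commit message above; the signature and the bit-smearing approach are assumptions, not the code from the commit. Staying in integer arithmetic removes any dependence on how pow() and log2() round on a given toolchain.

```cpp
#include <cstdint>

// Hypothetical sketch: smallest power of two that is >= v.
// Returns 1 for v == 0 or v == 1. Values above 2^63 wrap around to 0.
constexpr std::uint64_t next_pow2(std::uint64_t v)
{
    if (v <= 1)
        return 1;

    // Subtract one so that exact powers of two map to themselves,
    // then copy the highest set bit into every lower bit position.
    --v;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    v |= v >> 32;

    // All bits below the top bit are now set, so adding one
    // yields the next power of two.
    return v + 1;
}
```

With this sketch, next_pow2(100000) evaluates to 131072, the value the commit message says FFTS expects, and an input that is already a power of two is returned unchanged.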
bvernoux added a commit to bvernoux/scopehal that referenced this issue Dec 7, 2020
Compute next highest power of 2 to fix issues when using code (in different parts):
const size_t npoints = pow(2, ceil(log2(depth))); (see scopehal\TestWaveformSource.cpp => TestWaveformSource::DegradeSerialData()...)
The original code (pow(2, ceil(log2())) has some border effects when built with MSYS2 mingw64 "Release" mode as it does not compute correctly the highest power of 2 for example with parameter 100000 it was returning 131071 (instead of 131072) which crashed ffts.
This implementation fix issue ngscopeclient/scopehal-apps#295
bvernoux added a commit to bvernoux/scopehal that referenced this issue Dec 7, 2020
Compute next highest power of 2 to fix issues when using code (in different parts):
const size_t npoints = pow(2, ceil(log2(npoints_raw)));
The original code (pow(2, ceil(log2())) has some border effects when built with MSYS2 mingw64 "Release" mode as it does not compute correctly the highest power of 2 for example with parameter 100000 it was returning 131071 (instead of 131072) which crashed ffts.
This implementation fix issue ngscopeclient/scopehal-apps#295
bvernoux added a commit to bvernoux/scopehal that referenced this issue Dec 7, 2020
Compute next highest power of 2 to fix issues when using code:
const size_t npoints = pow(2, ceil(log2(npoints_raw)));
The original code (pow(2, ceil(log2())) has some border effects when built with MSYS2 mingw64 "Release" mode as it does not compute correctly the highest power of 2 for example with parameter 100000 it was returning 131071 (instead of 131072) which crashed ffts.
This implementation fix issue ngscopeclient/scopehal-apps#295
bvernoux added a commit to bvernoux/scopehal that referenced this issue Dec 7, 2020
…uild

Compute next highest power of 2 to fix issues when using code:
const size_t npoints = next_pow2(npoints_raw);
The original code (pow(2, ceil(log2())) has some border effects when built with MSYS2 mingw64 "Release" mode as it does not compute correctly the highest power of 2 for example with parameter 100000 it was returning 131071 (instead of 131072) which crashed ffts.
This implementation fix issue ngscopeclient/scopehal-apps#295
@bvernoux
Contributor

bvernoux commented Dec 7, 2020

I confirm this issue is fixed now with latest master since 27078c0

@azonenberg
Collaborator

@sbabic44 can you please test and confirm this is fixed?

Also, I added a backward compatibility version of the shaders which does not require GL_ARB_gpu_shader_int64, please test on your 4600 HD and see if it works there too?

@sbabic44
Author

sbabic44 commented Dec 14, 2020

Andrew,
thanks for the effort. It actually started with this on command line:

`sinisa@WKS MINGW64 ~/ffts/build/scopehal-apps/build
$ glscopeclient --debug myscope:demo:null:null
GetConsoleScreenBufferInfo() failed (6);
Warning: glscopeclient works best with the OMP_WAIT_POLICY environment variable
set to PASSIVE
Detecting CPU features...
* AVX2

Warning: Warning: Can't parse preference value 13,370000 for preference hidden_setting, ignoring
Warning: Warning: Can't parse preference value 42,090000 for preference test_real, ignoring
`
If I try to create an FFT on one of the channels, or create a histogram, it still drops out with no warning.

But the shaders now work just fine with the 4600 HD on Windows 10.

Best regards

@azonenberg azonenberg reopened this Dec 14, 2020
@azonenberg
Collaborator

So looks like we still have a problem.

Additionally, it seems like the preference parsing is getting messed up by locales that use commas as the decimal separator. Let me file a separate ticket for that.
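As a rough sketch of one way to make number parsing independent of the user's locale: read the value through a stream imbued with the classic "C" locale, which always uses a dot as the decimal separator. The function name ParseDoubleClassic is hypothetical and this is not how the preference code in glscopeclient is written; the values would also have to be written out with the same locale for the round trip to work.

```cpp
#include <locale>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: parse a double using the classic "C" locale,
// regardless of the global locale set by the application or toolkit.
std::optional<double> ParseDoubleClassic(const std::string& text)
{
    std::istringstream in(text);
    in.imbue(std::locale::classic());   // '.' is the decimal separator

    double value = 0.0;
    if (!(in >> value))
        return std::nullopt;            // no number at the start

    // Allow trailing whitespace, but reject anything else that is left over
    // (for example the ",370000" tail of a comma-formatted value).
    if (!in.eof())
        in >> std::ws;
    if (!in.eof())
        return std::nullopt;

    return value;
}
```

Under this sketch, "13.370000" parses successfully, while "13,370000" is rejected instead of being silently truncated to 13.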

@azonenberg
Collaborator

@sbabic44 Can you try with latest code and see how it works? Try with and without the --noopencl command line argument.

@someone--else
Contributor

On rev 3e9523c (current master) / Windows segfault happens immediately after doing RF->FFT on any demo scope waveform (see call stack below)

Console output:

Detecting CPU features...
* AVX2
OpenCL support: not present at compile time. GPU acceleration disabled.
Context: OpenGL 4.2 compatibility profile
GL_VENDOR = NVIDIA Corporation
GL_RENDERER = Quadro T1000/PCIe/SSE2
GL_VERSION = 4.2.0 NVIDIA 452.66
GL_SHADING_LANGUAGE_VERSION = 4.20 NVIDIA via Cg compiler
Initial GL error code = 0
GL_ARB_gpu_shader_int64 = supported

Call stack:
(segfault happens on this line: __m256 vcos = _mm256_cos_ps(vscale); at i == 0, data/out seems aligned properly)

libscopeprotocols.dll!FFTFilter::CosineSumWindowAVX2(const float * data, size_t len, float * out, float alpha0) (c:\apps\msys64\home\root\gls\scopehal-apps\lib\scopeprotocols\FFTFilter.cpp:714)
libscopeprotocols.dll!FFTFilter::HammingWindow(const float * data, size_t len, float * out) (c:\apps\msys64\home\root\gls\scopehal-apps\lib\scopeprotocols\FFTFilter.cpp:822)
libscopeprotocols.dll!FFTFilter::ApplyWindow(const float * data, size_t len, float * out, FFTFilter::WindowFunction func) (c:\apps\msys64\home\root\gls\scopehal-apps\lib\scopeprotocols\FFTFilter.cpp:672)
libscopeprotocols.dll!FFTFilter::DoRefresh(FFTFilter * const this, AnalogWaveform * din, std::vector<EmptyConstructorWrapper, AlignedAllocator<EmptyConstructorWrapper, 64> > & data, double fs_per_sample, size_t npoints, size_t nouts, bool log_output) (c:\apps\msys64\home\root\gls\scopehal-apps\lib\scopeprotocols\FFTFilter.cpp:439)
libscopeprotocols.dll!FFTFilter::Refresh(FFTFilter * const this) (c:\apps\msys64\home\root\gls\scopehal-apps\lib\scopeprotocols\FFTFilter.cpp:315)
libscopehal.dll!Filter::RefreshIfDirty(Filter * const this) (c:\apps\msys64\home\root\gls\scopehal-apps\lib\scopehal\Filter.cpp:191)
OscilloscopeWindow::_ZN18OscilloscopeWindow17RefreshAllFiltersEv._omp_fn.0(void)() (c:\apps\msys64\home\root\gls\scopehal-apps\src\glscopeclient\OscilloscopeWindow.cpp:3081)
libgomp-1.dll![Unknown/Just-In-Time compiled code] (Unknown Source:0)
OscilloscopeWindow::RefreshAllFilters(OscilloscopeWindow * const this) (c:\apps\msys64\home\root\gls\scopehal-apps\src\glscopeclient\OscilloscopeWindow.cpp:3079)
OscilloscopeWindow::OnAllWaveformsUpdated(OscilloscopeWindow * const this, bool reconfiguring) (c:\apps\msys64\home\root\gls\scopehal-apps\src\glscopeclient\OscilloscopeWindow.cpp:2913)
OscilloscopeWindow::PollScopes(OscilloscopeWindow * const this) (c:\apps\msys64\home\root\gls\scopehal-apps\src\glscopeclient\OscilloscopeWindow.cpp:2852)
OscilloscopeWindow::OnTimer(OscilloscopeWindow * const this) (c:\apps\msys64\home\root\gls\scopehal-apps\src\glscopeclient\OscilloscopeWindow.cpp:493)
sigc::bound_mem_functor1<bool, OscilloscopeWindow, int>::operator()(const sigc::bound_mem_functor1<bool, OscilloscopeWindow, int> * const this, sigc::type_trait_take_t _A_a1) (c:\apps\msys64\mingw64\include\sigc++-2.0\sigc++\functors\mem_fun.h:2066)
sigc::adaptor_functor<sigc::bound_mem_functor1<bool, OscilloscopeWindow, int> >::operator()<int&>(const sigc::adaptor_functor<sigc::bound_mem_functor1<bool, OscilloscopeWindow, int> > * const this, int & _A_arg1) (c:\apps\msys64\mingw64\include\sigc++-2.0\sigc++\adaptors\adaptor_trait.h:89)
sigc::bind_functor<-1, sigc::bound_mem_functor1<bool, OscilloscopeWindow, int>, int, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil>::operator()(sigc::bind_functor<-1, sigc::bound_mem_functor1<bool, OscilloscopeWindow, int>, int, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil> * const this) (c:\apps\msys64\mingw64\include\sigc++-2.0\sigc++\adaptors\bind.h:1124)
sigc::internal::slot_call0<sigc::bind_functor<-1, sigc::bound_mem_functor1<bool, OscilloscopeWindow, int>, int, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil>, bool>::call_it(sigc::internal::slot_rep * rep) (c:\apps\msys64\mingw64\include\sigc++-2.0\sigc++\functors\slot.h:136)
libglibmm-2.4-1.dll![Unknown/Just-In-Time compiled code] (Unknown Source:0)
ScopeApp::run(ScopeApp * const this, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > filesToLoad, bool reconnect, bool nodata, bool retrigger, bool nodigital, bool nospectrum) (c:\apps\msys64\home\root\gls\scopehal-apps\src\glscopeclient\ScopeApp.cpp:120)
main(int argc, char ** argv) (c:\apps\msys64\home\root\gls\scopehal-apps\src\glscopeclient\main.cpp:289)

@someone--else
Contributor

Turning off AVX2 via g_hasAvx2 solves the problem

Might be related to this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412
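A minimal sketch of what such a workaround could look like, assuming the flag is filled in by a detection function at startup. The name HasUsableAvx2 is hypothetical, and how scopehal actually sets g_hasAvx2 is not shown in this thread. `__builtin_cpu_init` and `__builtin_cpu_supports` are GCC/Clang builtins for x86 targets.

```cpp
// Hypothetical detection helper: reports whether the AVX2 code paths
// should be used. On Windows it always returns false, as a workaround
// for the stack alignment problem discussed in this issue.
bool HasUsableAvx2()
{
#if defined(_WIN32)
    // Workaround: never take the AVX2 paths on Windows builds.
    return false;
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();                       // safe to call more than once
    return __builtin_cpu_supports("avx2") != 0; // runtime CPUID check
#else
    // Unknown compiler or non-x86 target: fall back to scalar code.
    return false;
#endif
}
```

Callers would then pick the AVX2 or the scalar implementation of a filter based on this result, so Windows builds stay on the scalar path until the compiler issue is resolved.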

@azonenberg
Collaborator

Rather than turning off AVX2 on Windows, what happens if you add -mavx to the compile command line? AVX1 should be pretty well supported and it might change the ABI enough to make stuff align correctly.

@someone--else
Contributor

Same segfault with -mavx

I think using clang instead of gcc might help, but not sure how to set CMake up for it to test

@azonenberg
Collaborator

azonenberg commented Mar 22, 2022

Just checking in on this: after ngscopeclient/scopehal#484, is this issue still relevant or can we close it?

@someone--else
Contributor

Linked GCC issue seems to be still open, so turning AVX off on Windows as a temp workaround is still relevant and shouldn't be reverted. Perhaps we can have a separate issue dedicated to reverting it when GCC is fixed?

There is also a way to fix this without fixing GCC by moving all relevant stack variables to classes or heap and aligning them manually, but I don't know if it's worth it since code quality will suffer as a result
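To make the trade-off concrete, here is a rough sketch of the manual alignment idea: a small heap buffer that over-allocates and then uses std::align to find a 32-byte aligned start, so the data no longer depends on the compiler aligning the stack. The class name AlignedFloatBuffer is hypothetical and not part of scopehal. Note that this only covers explicitly declared buffers; temporaries that the compiler itself spills to the stack are not affected by it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper: heap storage for `count` floats whose first element
// is aligned to `alignment` bytes (32 is what 256-bit AVX loads expect).
class AlignedFloatBuffer
{
public:
    explicit AlignedFloatBuffer(std::size_t count, std::size_t alignment = 32)
        : m_storage(count * sizeof(float) + alignment)   // room to shift
    {
        // Alignment must be a nonzero power of two for std::align.
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

        void* p = m_storage.data();
        std::size_t space = m_storage.size();
        m_data = static_cast<float*>(
            std::align(alignment, count * sizeof(float), p, space));
        assert(m_data != nullptr);
    }

    // The aligned pointer refers into m_storage, so copying or moving
    // the object would leave it dangling. Disallow both in this sketch.
    AlignedFloatBuffer(const AlignedFloatBuffer&) = delete;
    AlignedFloatBuffer& operator=(const AlignedFloatBuffer&) = delete;

    float* data() { return m_data; }

private:
    std::vector<unsigned char> m_storage;  // raw, over-allocated bytes
    float* m_data = nullptr;               // aligned start inside m_storage
};
```

Every function that currently keeps vector data in local variables would need a wrapper like this, which is the code quality cost mentioned above.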

@azonenberg
Collaborator

Hmm. Let me rename this issue then to reflect the true situation.

@azonenberg azonenberg added the core (Core application logic), performance (Something is too slow), and portability (Stuff that doesn't work on some OS) labels and removed the bug (Something isn't working) label on Mar 22, 2022
@azonenberg azonenberg changed the title from "Issue with FFTS for Windows with MSYS2 mingw64 (Keysight 3000T)" to "AVX2 disabled on Windows as workaround for gcc bug" on Mar 22, 2022

4 participants