Build/Compile OpenCV v3.2 on Windows with CUDA 8.0 and Intel MKL+TBB

Posted in CUDA, OpenCV

 

 
OpenCV 3.3 was released on 03/08/2017; for instructions on building it on Windows with CUDA 8.0 and Intel MKL+TBB, go here.
 
Because the pre-built Windows libraries available for OpenCV v3.2 do not include the CUDA modules, I have provided them for download here, and included the build instructions below for anyone who is interested.

The guide below details instructions for compiling the 64 bit version of the OpenCV v3.2 shared libraries with Visual Studio 2013 (it will also work with Visual Studio 2015 if selected in CMake), CUDA 8.0, and support for both the Intel Math Kernel Library (MKL) and Intel Threading Building Blocks (TBB). The guide is split into two parts: follow the first if you just want to compile OpenCV v3.2 with CUDA support, and the second if you also want to include the Intel performance libraries, SIMD optimizations and Eigen integration.

Note: The procedure outlined will not work for Visual Studio 2017 because this is not supported by the CUDA 8.0 Toolkit.

For Visual Studio 2013 and 2015 you first need to:

  • Download the source files, available on GitHub. Either clone the git repo, making sure to check out the 3.2.0 tag, or download this archive containing all the source files.
  • Install CMake – Version 3.7.1 is used in the guide.
  • Install the CUDA 8.0 Toolkit.
  • Optional – Install both the Intel MKL and TBB by registering for community licensing, and downloading for free.
  • Optional – Download and extract the Eigen C++ template library for linear algebra.
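
Before starting, it can help to confirm that the required tools are visible from a command prompt. The following is a minimal Python sketch, not part of the original procedure: it assumes CMake is on the PATH and that the CUDA 8.0 installer has set the CUDA_PATH_V8_0 environment variable, which it normally does on Windows. The function name check_prerequisites is purely illustrative.

```python
import os
import shutil


def check_prerequisites(env=None, which=shutil.which):
    """Report which of the required tools from the list above can be located.

    env and which can be overridden, which keeps the function easy to test.
    """
    env = os.environ if env is None else env
    return {
        # CMake is found if a cmake executable is on the PATH.
        "cmake": which("cmake") is not None,
        # Assumption: the CUDA 8.0 installer set this variable on Windows.
        "cuda_8_0": bool(env.get("CUDA_PATH_V8_0")),
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(name, "found" if found else "NOT found")
```

A missing entry does not necessarily mean the tool is absent, only that it could not be located in the way assumed here.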

 

Building OpenCV v3.2 with CUDA 8.0
  1. Fire up CMake and make sure that the Grouped checkbox is ticked, then select the location of the source files downloaded from GitHub and the location where the build will take place.
     

     
  2. Click the Configure button and select Visual Studio 12 2013 Win64 (32 bit CUDA support is limited). This may take a while, as CMake will download ffmpeg and the Intel Integrated Performance Primitives for Image processing and Computer Vision (IPP-ICV).
     

     
  3. Expand the BUILD group, untick BUILD_DOCS (requires additional dependencies, and can be downloaded from here) and tick BUILD_opencv_world (builds to a single dll). Your only unticked build options should be the ones shown in the image below.
     

     
  4. Expand the CUDA tab. CUDA_TOOLKIT_ROOT_DIR should point to your CUDA 8.0 toolkit installation; if you have more than one version of the toolkit installed and CMake has picked a different one, simply change the path to point to CUDA 8.0.

    The default CUDA_ARCH_BIN option is to build binary code for all architectures from 2.0 to 6.1 (Fermi to Pascal). This setting results in a long build time (~3.5 hours on an i7), but the binaries produced will run on all supported devices. If you only want to execute OpenCV on a specific device, enter only the compute capability of that device here. Remember that the produced libraries are not guaranteed to run on any device of a different major compute version to the ones entered; see the CUDA C Programming Guide for details.

    If you are comfortable with the implications, you can also enable CUDA_FAST_MATH, which will enable the --use_fast_math compiler option; again, see the CUDA C Programming Guide for details.
     

     

  5. Expand WITH and enable WITH_CUBLAS to enable the CUDA Basic Linear Algebra Subroutines (cuBLAS).
     

     
  6. If you want to include the Intel performance libraries, SIMD optimizations and Eigen integration, go to Including Intel MKL, TBB and other optional components, before proceeding.
  7. Press Configure again and ensure that there are no warning messages in red in the configuration window. If there are, the Visual Studio solution may still be generated, but it will probably fail to build.
  8. Press Generate and wait until the bottom of the window indicates success.
     

     
  9. Press Open Project to open the solution in Visual Studio (this button is not available in older versions of CMake; in those, just locate and open the Visual Studio solution file).
     

     
  10. Click Solution Explorer, expand CMakeTargets, right click on INSTALL and click Build. This will both build the library and copy the necessary redistributable parts to the install directory, D:/OpenCV/build/install in this example. If everything was successful, congratulations, you now have OpenCV v3.2 built with CUDA 8.0.
     

     
  11. NOTE: If you change or remove any options after pressing Configure a second time, the build may fail; it is best to remove the build directory and start again. This may seem overcautious, but it is preferable to waiting an hour for the build to fail and then starting again.
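
For anyone who prefers the command line to the CMake GUI, the options chosen in steps 2 to 5 can be passed as -D cache entries. The Python sketch below only assembles and runs the argument lists; it is an illustration under assumptions rather than the procedure used to produce the downloads. The function names and the paths in the usage note are hypothetical, WITH_CUDA is assumed to be left at its default of ON, and the generator name is the one CMake lists for 64 bit Visual Studio 2013.

```python
import subprocess


def configure_args(source_dir, arch_bin=None, fast_math=False,
                   generator="Visual Studio 12 2013 Win64"):
    """Return a cmake command mirroring the GUI choices in steps 2 to 5.

    CMake 3.7 has no -S/-B options, so the command is meant to be run from
    inside the build directory with the source directory as last argument.
    """
    args = [
        "cmake", "-G", generator,
        "-DBUILD_DOCS=OFF",         # step 3: skip the documentation
        "-DBUILD_opencv_world=ON",  # step 3: build a single dll
        "-DWITH_CUBLAS=ON",         # step 5: enable cuBLAS
    ]
    if arch_bin:
        # Step 4: restrict the build to specific compute capabilities,
        # passed as a space-separated list such as "6.0 6.1".
        args.append("-DCUDA_ARCH_BIN=" + " ".join(arch_bin))
    if fast_math:
        args.append("-DCUDA_FAST_MATH=ON")
    args.append(source_dir)
    return args


def install_args(config="Release"):
    """Return the command that builds the INSTALL target (step 10)."""
    return ["cmake", "--build", ".", "--target", "INSTALL", "--config", config]


def run(build_dir, source_dir, **options):
    """Configure, generate, then build INSTALL inside build_dir."""
    subprocess.check_call(configure_args(source_dir, **options), cwd=build_dir)
    subprocess.check_call(install_args(), cwd=build_dir)
```

With the example layout used in this guide, a call might look like run("D:/OpenCV/build", "D:/OpenCV/opencv-3.2.0", arch_bin=["6.1"]), where the source path is a made-up example. As with the GUI, it is safest to start from an empty build directory.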

 

Including Intel MKL, TBB and other optional components
  1. If MKL and TBB are installed correctly, the paths to them should have been picked up by CMake (see below). All that is required is to enable TBB by ticking MKL_WITH_TBB.
     

     
    Then enable MKL by ticking HAVE_MKL.
     

     
  2. To build with SIMD optimizations enable the options shown below.
     

     
  3. To build with Eigen set EIGEN_INCLUDE_PATH to the directory you extracted the archive to.
     

     
  4. Press Configure and verify that the optional components have been selected, then proceed as above.
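
The check in step 4 can also be done by reading the summary that CMake prints at the end of configuration. The Python sketch below assumes the summary contains lines of the form "Use Eigen: YES (ver ...)", as in the CMake output quoted by a reader in the comments; the exact wording may differ between OpenCV versions, and the function name is illustrative.

```python
import re

# Matches lines such as "Use Eigen: YES (ver 3.3.90)" or "--   Use Lapack: NO".
_SUMMARY_LINE = re.compile(r"^[\s-]*use\s+(.+?):\s*(YES|NO)\b", re.IGNORECASE)


def parse_summary(text):
    """Map each 'Use <name>: YES/NO' summary line to True or False.

    Lines whose value is not YES or NO (for example a version number)
    are ignored.
    """
    found = {}
    for line in text.splitlines():
        match = _SUMMARY_LINE.match(line)
        if match:
            name = match.group(1).strip().lower()
            found[name] = match.group(2).upper() == "YES"
    return found


if __name__ == "__main__":
    sample = "Use Lapack: NO\nUse Eigen: YES (ver 3.3.90)\nUse Cuda: YES (ver 8.0)"
    for name, enabled in parse_summary(sample).items():
        print(name, "enabled" if enabled else "disabled")
```

A component reported as disabled here is a hint to revisit the corresponding option before generating the solution.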
     

52 thoughts on “Build/Compile OpenCV v3.2 on Windows with CUDA 8.0 and Intel MKL+TBB”

  1. Thank you for this detailed explanation. I switched from opencv 2.x to opencv 3.2 recently. I didn’t realise when I was downloading it that the pre-built opencv 3.2 library doesn’t include the CUDA part of the code, so I had to build it myself. This article has saved my life and time!

  2. Hello.
    Can you give me an OpenCV v3.2 build with CUDA 8.0 compiled for Windows x86 (the limitations don’t matter)? Also, if possible, one compiled without CUDA for x86 (because the official opencv distro only includes binaries without CUDA for x64).
    I need a 32 bit version of my app that uses OpenCV, and it has the option to enable/disable CUDA at run time. Thanks a lot, I would appreciate it.

    1. Hi,

      I have uploaded 32 bit versions of OpenCV compiled without CUDA for both VS2013 and VS2015 to the downloads page.

      To compile a 32 bit version of OpenCV with CUDA v8.0, you would need to remove all OpenCV CUDA modules which use the CUDA NPP libraries. From a quick inspection of the output from cmake, this may be all of them. If you really need a 32 bit version of OpenCV with all the CUDA modules then I would suggest trying to compile it with CUDA v6.5, although I am not sure if this will work with OpenCV v3.2.

  3. Just wondering, will all of the above work with VS 2017 x64 compiler?
    So far getting build errors in opencv_core or opencv_world depending on whether opencv_world is enabled, with VS just casually telling me “cmd.exe exited with code 1” without any extra info really.

    1. I don’t think so. Nvidia does not list Visual Studio 2017 as one of its supported compilers for CUDA 8.0.

      I think you will have to wait until the next version of CUDA is released. If you only have VS 2017 you can try to generate a VS 2015 solution and build it in VS 2017 with the VS 2015 tool set, see this blog post. However you would then need to use the VS 2015 tool set for all projects that depend on your OpenCV build.

      1. Well yeah, but then the add-in won’t work. Just had to get myself the full VS2015 IDE and use that one for now. Sad that NVIDIA is just a small company and can’t afford to keep up with the updates every two months.

        Ty for the guide above. Regards.

  4. Hi
    Thanks for the descriptive tutorials. But the build does not contain the opencv_traincascade executable and the other executables which are helpful for training. Can you provide them for opencv 3.2, visual studio 2015 and cuda 8.0?

    1. Hi, I can and probably will include the additional binaries in the future, however they are available as part of the standard OpenCV 3.2 download.

      The binaries from the above link are compiled with VS2015 (although I wouldn’t think it should matter, because they are stand alone exe’s) and as far as I know do not run on the GPU.

        1. Hi, I have updated the download page to include all the binaries, however I do not think this will help you. I have quickly browsed the code in the opencv_traincascade project and there do not appear to be calls to any GPU accelerated functions, or any cuda source code in the project itself.

  5. Hi, I have a problem. When I build the INSTALL project, after about 30-45 minutes of work VS starts giving me about 45 errors like “LNK 1104: cannot open file ‘..\..\lib\opencv_world320d.lib’”.

    What can I do with that?

    1. Hi,
      Do you see any errors relating to the build of opencv_world320d.lib failing before the link errors you describe? Can you send me the output from the build window?
      Are you using VS2013 or VS2015, and are you building with CUDA?

      1. First error:
        > C:\Program Files (x86)\Windows Kits\8.1\Include\um\combaseapi.h(229): error : identifier “IUnknown” is undefined
        9>
        9>C:\Program Files (x86)\Windows Kits\8.1\Include\um\combaseapi.h(229): warning : expression has no effect
        9>
        9>D:\OCV\opencv-3.2.0\modules\core\include\opencv2/core/base.hpp(365): warning : function declared with “noreturn” does return
        9>
        9> 1 error detected in the compilation of “C:/Users/tomus/AppData/Local/Temp/tmpxft_000018ac_00000000-6_gftt.cpp4.ii”.
        9>CUSTOMBUILD : nvcc warning : The ‘compute_20’, ‘sm_20’, and ‘sm_21’ architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
        9> gftt.cu
        9> CMake Error at cuda_compile_generated_gftt.cu.obj.cmake:264 (message):
        9> Error generating file
        9> D:/OCV/build2/modules/world/CMakeFiles/cuda_compile.dir/__/cudaimgproc/src/cuda/Debug/cuda_compile_generated_gftt.cu.obj

        1. Hi, this looks to be an error which people have experienced on Windows 7 and/or earlier versions of Visual Studio, related to the Windows SDK. You could try adding WIN32_LEAN_AND_MEAN as a global preprocessor definition in all affected projects, as suggested at
          https://devtalk.nvidia.com/default/topic/391375/cuda-programming-and-performance/-quot-identifier-quot-iunknown-quot-is-undefined-quot-error-vista-visual-studio-2005/ but I cannot guarantee it will help.
          Are you compiling on Windows 7?

  6. Thanks a lot!!! I was having problems building for Visual Studio 14 2015; when I changed to 64 bits it worked!!

      1. See (2), and instead of selecting Visual Studio 12 2013 Win64, select Visual Studio 14 2015 Win64.

  7. Hi,
    very good blog about a topic which is complicated to me.
    I needed the WITH_MSMF code compiled, for better support of my MS LifeCam Studio. The standard Open_CV3.2 package only supports my webcam in a very basic way, so the automatic exposure is off.
    Did you integrate MSMF support in your package, James?

    1. Hi, sorry I did not build any of the binaries available on the download page with the WITH_MSMF flag enabled. I would hope that you could just enable that flag before building but I have not tested this myself, so I cannot confirm that this will work.

  8. well, thanks to you, now I’ve got the opencv_ffmpeg320_64.dll 🙂
    —————————————————————————————–
    I tried to build with cmake and VS 2015 community, but LAPACK and MKL always give errors.
    I just had one error left, but didn’t get rid of it.
    So I left the Intel libs aside and just tried CUDA: no errors in CMake, 40 errors in VS2015.
    Maybe I should check the Intel settings in VS again before I start building.

    1. Hi Andre, are you compiling with WITH_MSMF flag? If so have you tried without to verify that this works?
      When you say
      “I tried to build with cmake and VS 2015 community, but LAPACK and MKL always error.”
      is that just in cmake with the WITH_MSMF box ticked, if so what are the errors reported by cmake?

  9. LAPACK(OpenBLAS): LAPACK_LIBRARIES: C:/opencv-master/OpenBLAS-v0.2.14-Win64-int64/lib/libopenblas.a;C:/opencv-master/OpenBLAS-v0.2.19-Win64-int32/lib/libopenblas.dll.a
    LAPACK(OpenBLAS): Can’t build LAPACK check code. This LAPACK version is not supported.

    I already tried 3 versions of LAPACK; no matter which one, it is always: Can’t build LAPACK check code. This LAPACK version is not supported.

    On Win7.

    1. Hi, have you downloaded and installed the Intel MKL, and are they located in the C:/Program Files (x86)/IntelSWTools/compilers_and_libraries/windows/mkl/ directory? If so which versions are you using, 2017 update 1?

    1. Hello James,
      CUDA NVCC target flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-D_FORCE_INLINES
      LAPACK(MKL): LAPACK_LIBRARIES: C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2017/windows/mkl/lib
      WARNING: Target “cmTC_426db” requests linking to directory “C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2017/windows/mkl/lib”. Targets may link only to libraries. CMake is dropping the item.
      WARNING: Target “cmTC_426db” requests linking to directory “C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2017/windows/mkl/lib”. Targets may link only to libraries. CMake is dropping the item.
      WARNING: Target “cmTC_426db” requests linking to directory “C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2017/windows/mkl/lib”. Targets may link only to libraries. CMake is dropping the item.
      WARNING: Target “cmTC_426db” requests linking to directory “C:/Program Files (x86)/IntelSWTools/compilers_and_libraries_2017/windows/mkl/lib”. Targets may link only to libraries. CMake is dropping the item.
      LAPACK(MKL): Can’t build LAPACK check code. This LAPACK version is not supported.
      -this time with the version you recommended.-
      Other third-party libraries:
      Use Intel IPP: 2017.0.2 [2017.0.2]
      at: C:/opencv-master/build/VS2015/x64/CUDA_V8_0/3rdparty/ippicv/ippicv_win
      Use Intel IPP IW: prebuilt binaries (2017.0.2)
      Use Intel IPP Async: NO
      Use Lapack: NO
      Use Eigen: YES (ver 3.3.90)
      Use Cuda: YES (ver 8.0)
      Use OpenCL: YES
      Use OpenVX: NO
      Use custom HAL: NO

      NVIDIA CUDA
      Use CUFFT: YES
      Use CUBLAS: YES
      USE NVCUVID: NO
      NVIDIA GPU arch: 50 52 60 61
      NVIDIA PTX archs:
      Use fast math: YES

      1. Hi Andre,
        Are you using OpenCV v3.2.0 commit 70bbf17b133496bd7d54d034b0f94bd869e0e810?
        I am asking because I would expect your output in CMake of
        LAPACK(MKL): LAPACK_LIBRARIES: …
        to be
        LAPACK_IMPL: MKL, LAPACK_LIBRARIES: …
        I have just checked, and LAPACK(MKL): is the CMake output in v3.3.0-rc1. Is there any chance you are using the release candidate for v3.3.0?

        Do you have the MKL libraries located at C:/Program Files (x86)/IntelSWTools/compilers_and_libraries/?
        I have checked w_mkl_2017.3.210.exe on my system and CMake finds the libraries without any problems. C:/Program Files (x86)/IntelSWTools/compilers_and_libraries/ should be a shortcut to C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2017 and the MKL libs contained inside C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2017\windows\mkl should be a shortcut to C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2017.4.210\windows\mkl.

        1. Hi James, yes I used the rc-version. Now, with 3.2, configuring and generating without warnings and with WITH_MSMF checked, VS gives me errors and warnings and compiles the FFmpeg dll only. I just chose the 4 latest Nvidia architectures, do you think this might cause problems?

          1. Hi Andre,
            The FFmpeg.dll is not compiled, CMake downloads it when you first press configure. What errors are you seeing in Visual Studio?
            If I were you I would start again (delete all files from your build directory) and try to compile OpenCV without CUDA or WITH_MSMF, to make sure this compiles. To do that you should just need to press configure in CMake, wait for everything to be downloaded, untick all CUDA options under both BUILD and WITH (maybe remove BUILD_opencv_world), configure and generate.

  10. Hi James, first thank you for this article. I couldn’t get my own build to install without errors so I have decided to try to use your download file for OpenCV w/CUDA for Visual studio 2013. Inside of the download file there is an Install folder. What are the steps from downloading your file to being able to import OpenCV Libraries in a cuda project?

    1. Hi, to use the OpenCV functions, all you need to do is link to opencv_world320.lib inside the install\x64\vc14\lib folder and either include the install\x64\vc14\bin directory on your system path or copy opencv_world320.dll into the same directory as your program. For examples of how to use the CUDA routines, including which headers are required, I would inspect the gpu samples for the functions you require: https://github.com/opencv/opencv/tree/master/samples/gpu.

  11. your pic cuts off before showing what to put for the MKL LAPACK libraries… I have no idea what to put there.. there is no HAVE_MKL button either ? it complains about OpenBlas but I want to use MKL !

    Could not find OpenBLAS lib. Turning OpenBLAS_FOUND off
    A library with BLAS API not found. Please specify library location.
    LAPACK requires BLAS

    1. Hi Gary,
      Are you using OpenCV v3.2.0 commit 70bbf17b133496bd7d54d034b0f94bd869e0e810?
      Have you installed Intel MKL, because the location should be picked up automatically?

  12. Hi, first I must thank you because you helped a lot in my project, and saved me a lot of time.
    But I’m having problems running the compiled program (.exe) on another computer. When I run the program it crashes, and the Windows event viewer gives the faulty module as opencv_world320.
    Any ideas why? It runs well on the computer I compiled it on.

    1. Hi, I cannot be sure, without more information. Have you copied across the opencv_world320.dll to your other computer?

      1. Both PCs are Windows 10 with CUDA, MKL and TBB installed. I made an .exe in vs2015 and built it in release mode. If I copy it without the .dll it says it needs it, but even if I copy the .dll to the other computer it crashes and gives a 0x000142 error. This also happens with the executables from the \bin folder from opencv.

          1. Sorry for the response delay, I was trying to solve it without nagging too much.
            Dependency Walker throws some false errors with the API-MS-WIN-CORE-XXX.DLL calls, and after profiling it doesn’t say anything about the error.

            I have the vs2015 redistributable so I think that’s not the problem. Are there any requirements to run your build?

          2. Hi, I have tried opencv_version.exe on a windows 8 and 10 machine without any problems, all I required was the CUDA shared libs (cudart64_80.dll, cufft64_80.dll, nppc64_80.dll, nppi64_80.dll, npps64_80.dll, cublas64_80.dll), opencv_world320.dll and VS2015 redistributable. Which executable from the pre-built binaries are you running?

            I noticed that windows explicitly warns you about which dll is missing, it is therefore unlikely you would be getting a 0x000142 error if you were missing one of the above.

    1. Hi, this is very odd, because the debug build opencv_versiond.exe is dynamically linked to the VS2015 C/C++ debug runtime (which is not part of the VS2015 redistributable package) and should not run on a PC without VS2015 installed, unless you have manually added MSVCP140D.dll or it came with another application, in which case this may be the source of your problems. You should get an error about MSVCP140D.dll being missing when you try to run it, but you do not. Did you download and install the VS2015 runtime yourself, or did it come with another application? If it came with another application, it may be an older version.

      Because the error only occurs on release builds, you could try remote debugging of a release build with debug symbols.

      First I would try the opencv_version executable from the official build here on both machines, to see if the error is related to the CUDA build or a configuration issue with your second machine. If that doesn’t work you could try installing VS2015 on the second machine and running your program through the debugger.

        1. Hi, when i build it in debug mode it builds fine, but in release i get this error:
          Unhandled exception at 0x00007FFDB4857588 (opencv_world320.dll) in CudaDetection.exe: 0xC000001D: Illegal Instruction.

          I’m only getting errors with the CUDA version, the normal one works fine.

          1. Hi,

            Are you now building on the other machine with Visual Studio 2015 and running through the Visual Studio debugger, or remote debugging? Can you hit a debug point placed at the beginning of main before the exception is thrown? Do the locations of the loaded modules (for cuda, opencv and the visual studio runtime) shown in the modules window look correct?

            When/how did you install the Visual Studio 2015 debug redistributable?
