Build/Compile OpenCV v3.2 on Windows with CUDA 8.0 and Intel MKL+TBB

Posted in CUDA, OpenCV


Because the pre-built Windows libraries available for OpenCV v3.2 do not include the CUDA modules, I have provided them for download here, and included the build instructions below for anyone who is interested.

The guide below details how to compile the 64-bit OpenCV v3.2 shared libraries with Visual Studio 2013 (it will also work with Visual Studio 2015 if that is selected in CMake), CUDA 8.0, and support for both the Intel Math Kernel Library (MKL) and Intel Threading Building Blocks (TBB). The guide is split into two parts: follow the first if you just want to compile OpenCV v3.2 with CUDA support, and the second if you also want to include the Intel performance libraries, SIMD optimizations and Eigen integration.

Note: The procedure outlined will not work for Visual Studio 2017 because this is not supported by the CUDA 8.0 Toolkit.

For Visual Studio 2013 and 2015 you first need to:

  • Download the source files, available on GitHub. Either clone the git repo, making sure to check out the 3.2.0 tag, or download this archive containing all the source files.
  • Install CMake – Version 3.7.1 is used in the guide.
  • Install the CUDA 8.0 Toolkit.
  • Optional – Install both the Intel MKL and TBB by registering for community licensing, and downloading for free.
  • Optional – Download and extract the Eigen C++ template library for linear algebra.


Building OpenCV v3.2 with CUDA 8.0
  1. Fire up CMake, making sure that the Grouped checkbox is ticked, select the location of the source files downloaded from GitHub and the location where the build will take place.

  2. Click the Configure button and select Visual Studio 2013 Win64 (32-bit CUDA support is limited). This may take a while, as CMake will download ffmpeg and the Intel Integrated Performance Primitives for Image processing and Computer Vision (IPP-ICV).

  3. Expand the BUILD group, untick BUILD_DOCS (requires additional dependencies, and can be downloaded from here) and tick BUILD_opencv_world (builds to a single dll). Your only unticked build options should be the ones shown in the image below.

  4. Expand the CUDA tab. CUDA_TOOLKIT_ROOT_DIR should point to your CUDA 8.0 toolkit installation. If you have more than one version of the toolkit installed and CMake has picked a different one, simply change the path to point to CUDA 8.0.

    The default CUDA_ARCH_BIN option is to build binary code for all architectures from compute capability 2.0 to 6.1 (Fermi to Pascal). This setting results in a long build time (~3.5 hours on an i7), but the binaries produced will run on all supported devices. If you only want to run OpenCV on a specific device, enter only the compute capability of that device here. Remember that the libraries produced are not guaranteed to run on any device with a different major compute capability from the ones entered; see the CUDA C Programming Guide for details.

    If you are comfortable with the implications, you can also enable CUDA_FAST_MATH, which turns on the --use_fast_math compiler option; again, see the CUDA C Programming Guide for details.


  5. Expand WITH and enable WITH_CUBLAS to enable the CUDA Basic Linear Algebra Subroutines (cuBLAS).

  6. If you want to include the Intel performance libraries, SIMD optimizations and Eigen integration, go to Including Intel MKL, TBB and other optional components, before proceeding.
  7. Press Configure again and ensure that there are no warning messages in red in the configuration window. If there are, the Visual Studio solution may still be generated, but it will probably fail to build.
  8. Press Generate and wait until the bottom of the window indicates success.

  9. Press Open Project (not available in older versions of CMake, for those just locate and open the Visual Studio solution file) to open up the solution in Visual Studio.

  10. Click Solution Explorer, expand CMakeTargets, right click on INSTALL and click Build. This will both build the library and copy the necessary redistributable parts to the install directory, D:/OpenCV/build/install in this example. If everything was successful, congratulations, you now have OpenCV v3.2 built with CUDA 8.0.

  11. NOTE: If you change or remove any options after pressing Configure a second time, the build may fail; it is best to remove the build directory and start again. This may seem overcautious, but it is preferable to waiting an hour for the build to fail and then starting again.
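
For anyone who prefers scripting the configuration instead of clicking through the CMake GUI, the steps above can be approximated on the command line. The following is only a minimal Python sketch that assembles the arguments; it is not how the binaries offered here were produced. The option names come from the steps above, while the function names, the source directory, the CUDA install path and the choice of compute capability 6.1 are assumptions for illustration. The generator string is the name CMake 3.7 uses for the 64-bit Visual Studio 2013 generator.

```python
def cmake_configure_args(source_dir, cuda_root, arch_bin=None, fast_math=False,
                         generator="Visual Studio 12 2013 Win64"):
    """Return cmake arguments mirroring the GUI choices in steps 2 to 5.

    The command is meant to be run from inside an empty build directory.
    """
    options = [
        ("BUILD_DOCS", "OFF"),                             # step 3
        ("BUILD_opencv_world", "ON"),                      # step 3: single dll
        ("CUDA_TOOLKIT_ROOT_DIR", cuda_root),              # step 4
        ("CUDA_FAST_MATH", "ON" if fast_math else "OFF"),  # step 4, optional
        ("WITH_CUBLAS", "ON"),                             # step 5
    ]
    if arch_bin:
        # Restrict the build to specific compute capabilities, e.g. "6.1".
        # When omitted, CMake keeps its default list of architectures.
        options.append(("CUDA_ARCH_BIN", arch_bin))
    args = ["cmake", "-G", generator]
    args += ["-D{}={}".format(name, value) for name, value in options]
    args.append(source_dir)
    return args


def cmake_install_args(config="Release"):
    """Return the arguments that build the INSTALL target (step 10)."""
    return ["cmake", "--build", ".", "--target", "INSTALL", "--config", config]


if __name__ == "__main__":
    configure = cmake_configure_args(
        source_dir="D:/OpenCV/opencv-3.2.0",  # hypothetical source location
        # Assumed default install location of the CUDA 8.0 toolkit.
        cuda_root="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0",
        arch_bin="6.1",  # assumed target: a single Pascal device
    )
    print(configure)
    print(cmake_install_args())
```

Each list can be passed to `subprocess.run(args, cwd=build_dir, check=True)`, where `build_dir` is the empty build directory (D:/OpenCV/build in this example). Building the INSTALL target this way should be equivalent to step 10, but the Release configuration is an assumption; pass "Debug" if that is what you need.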


Including Intel MKL, TBB and other optional components
  1. If MKL and TBB are installed correctly, the paths to them should have been picked up by CMake (see below). All that is required is to enable TBB by ticking MKL_WITH_TBB and enable MKL by ticking HAVE_MKL.

  2. To build with SIMD optimizations enable the options shown below.

  3. To build with Eigen set EIGEN_INCLUDE_PATH to the directory you extracted the archive to.

  4. Press Configure and verify that the optional components have been selected, then proceed as above.
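
If you would rather check the selection outside the CMake window, the choices are also recorded in the CMakeCache.txt file that CMake writes to the build directory, one `NAME:TYPE=VALUE` entry per line. The sketch below parses that format and reports which of the optional components named above are not enabled. The helper names and the sample cache excerpt (including the D:/eigen path) are hypothetical; only the three option names come from this guide.

```python
# Options from the steps above that should be enabled in the cache.
EXPECTED = ["MKL_WITH_TBB", "HAVE_MKL", "EIGEN_INCLUDE_PATH"]

# Hypothetical excerpt of a CMakeCache.txt, for illustration only.
SAMPLE_CACHE = """\
// Example entries
MKL_WITH_TBB:BOOL=ON
HAVE_MKL:BOOL=ON
EIGEN_INCLUDE_PATH:PATH=D:/eigen
"""


def read_cmake_cache(text):
    """Parse NAME:TYPE=VALUE lines, skipping comments and blank lines."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "//")):
            continue
        name_and_type, sep, value = line.partition("=")
        if not sep:
            continue
        name = name_and_type.split(":", 1)[0]
        entries[name] = value
    return entries


def is_enabled(value):
    """Treat boolean-like true values and non-empty found paths as enabled."""
    if value is None or value == "" or value.endswith("-NOTFOUND"):
        return False
    if value.upper() in ("OFF", "FALSE", "NO", "N", "0"):
        return False
    return True


def missing_components(entries, expected=EXPECTED):
    """Return the expected option names that are absent or disabled."""
    return [name for name in expected if not is_enabled(entries.get(name))]


if __name__ == "__main__":
    print(missing_components(read_cmake_cache(SAMPLE_CACHE)))
```

In practice you would read the real file, for example `open("D:/OpenCV/build/CMakeCache.txt").read()`, and an empty result suggests that all three options were picked up. This only confirms what was configured, not that the build itself will succeed.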

21 thoughts on “Build/Compile OpenCV v3.2 on Windows with CUDA 8.0 and Intel MKL+TBB”

  1. Thank you for this detailed explanation. I switched from OpenCV 2.x to OpenCV 3.2 recently. I didn’t realise when downloading it that the pre-built OpenCV 3.2 library doesn’t include the CUDA part of the code, so I had to build it myself. This article has saved my life and time!

  2. Hello.
    Can you give me OpenCV v3.2 with CUDA 8.0 compiled for Windows x86 (the limitations don’t matter) and also, if possible, compiled without CUDA for x86 (because the official OpenCV distribution only includes binaries without CUDA for x64)?
    I need a 32-bit version of my app, which uses OpenCV and has the option to enable/disable CUDA at run time. Thanks a lot. I would appreciate that.

    1. Hi,

      I have uploaded 32 bit versions of OpenCV compiled without CUDA for both VS2013 and VS2015 to the downloads page.

      To compile a 32 bit version of OpenCV with CUDA v8.0, you would need to remove all OpenCV CUDA modules which use the CUDA NPP libraries. From a quick inspection of the output from cmake, this may be all of them. If you really need a 32 bit version of OpenCV with all the CUDA modules then I would suggest trying to compile it with CUDA v6.5, although I am not sure if this will work with OpenCV v3.2.

  3. Just wondering, will all of the above work with VS 2017 x64 compiler?
    So far getting build errors in opencv_core or opencv_world depending on whether opencv_world is enabled, with VS just casually telling me “cmd.exe exited with code 1” without any extra info really.

    1. I don’t think so. Nvidia does not list Visual Studio 2017 as one of its supported compilers for CUDA 8.0.

      I think you will have to wait until the next version of CUDA is released. If you only have VS 2017 you can try to generate a VS 2015 solution and build it in VS 2017 with the VS 2015 tool set, see this blog post. However you would then need to use the VS 2015 tool set for all projects that depend on your OpenCV build.

      1. Well yeah, but then the add-in won’t work. Just had to get myself the full VS2015 IDE and use that one for now. Sad that NVIDIA is just a small company and can’t afford to keep up with the updates every two months.

        Ty for the guide above. Regards.

  4. Hi
    Thanks for the descriptive tutorial. But the build does not contain the opencv_traincascade executable and the other executables that are helpful for training. Can you provide them for OpenCV 3.2, Visual Studio 2015 and CUDA 8.0?

    1. Hi, I can and probably will include the additional binaries in the future, however they are available as part of the standard OpenCV 3.2 download.

      The binaries from the above link are compiled with VS2015 (although I wouldn’t think it should matter, because they are stand alone exe’s) and as far as I know do not run on the GPU.

        1. Hi, I have updated the download page to include all the binaries; however, I do not think this will help you. I have quickly browsed the code in the opencv_traincascade project and there do not appear to be calls to any GPU accelerated functions, or any CUDA source code in the project itself.

  5. Hi, I have a problem. When I build the INSTALL project, after about 30-45 minutes of work VS starts giving me about 45 errors like “LNK1104: cannot open file “..\..\lib\opencv_world320d.lib”.

    What can I do about that?

    1. Hi,
      Do you see any errors relating to the build of opencv_world320d.lib failing before the link errors you describe? Can you send me the output from the build window?
      Are you using VS2013 or VS2015, and are you building with CUDA?

      1. First error:
        > C:\Program Files (x86)\Windows Kits\8.1\Include\um\combaseapi.h(229): error : identifier “IUnknown” is undefined
        9>C:\Program Files (x86)\Windows Kits\8.1\Include\um\combaseapi.h(229): warning : expression has no effect
        9>D:\OCV\opencv-3.2.0\modules\core\include\opencv2/core/base.hpp(365): warning : function declared with “noreturn” does return
        9> 1 error detected in the compilation of “C:/Users/tomus/AppData/Local/Temp/tmpxft_000018ac_00000000-6_gftt.cpp4.ii”.
        9>CUSTOMBUILD : nvcc warning : The ‘compute_20’, ‘sm_20’, and ‘sm_21’ architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
        9> CMake Error at (message):
        9> Error generating file
        9> D:/OCV/build2/modules/world/CMakeFiles/cuda_compile.dir/__/cudaimgproc/src/cuda/Debug/

        1. Hi, this looks to be an error which people have experienced on Windows 7 and/or earlier versions of Visual Studio, related to the Windows SDK. You could try adding WIN32_LEAN_AND_MEAN as a global preprocessor definition in all affected projects, as suggested, but I cannot guarantee it will help.
          Are you compiling on Windows 7?

  6. Thanks a lot!!! I was having problems building for Visual Studio 14 2015; when I changed to 64 bits it worked!!
