WebGL is a JavaScript API for rendering 2D and 3D graphics in
the browser. Released in 2011, it has become the backbone of
most GPU-accelerated rendering on the web. Major web-based
rendering libraries, such as Three.js, Babylon.js, and
Cornerstone.js, are built on top of WebGL. Until the last few
years, the phrase "rendering on the web" was synonymous with
WebGL.
WebGL 2.0, the current version, is based on OpenGL ES 3.0, which was released in 2012. WebGL 1.0 is based on the even older OpenGL ES 2.0, released in 2007. This means that the capabilities of WebGL are rooted in a graphics API that is more than a decade old.
Consumer-level GPUs have advanced considerably in that time. NVIDIA's GeForce 200 series was released in 2008; since then the line has progressed through the 300, 400, ..., 900 series and on to the 10, 20, 30, and 40 series. Alongside this, modern graphics APIs such as DirectX 12, Vulkan, and Metal have emerged, enabling advanced capabilities in desktop and mobile gaming. Browser-based games, however, still rely on WebGL, which is based on OpenGL ES 3.0 (released in 2012). Web-based rendering therefore lags well behind desktop and mobile rendering: browsers face performance and frame-rate limitations, and many advanced features of modern GPUs remain inaccessible because WebGL relies on outdated technology.
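WebGPU is the successor to WebGL. Its development started in
2017. It gives web applications access to the features of
modern graphics cards. WebGPU is available in Chrome and Edge,
with support arriving in Firefox and Safari, though coverage
still varies by platform and browser version.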
WebGPU can also be used from Web Workers, which helps keep
heavy work such as loading a model off the main thread.
Under the hood, WebGPU is implemented on top of a native
graphics API such as DirectX 12, Vulkan, or Metal, depending
on the platform the browser runs on.
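Because support differs between browsers and platforms, it is usually worth checking for WebGPU before relying on it. The following is a minimal sketch of that check, not a definitive implementation. The names hasWebGPU and initWebGPU and the small GpuLike and AdapterLike interfaces are assumptions made for illustration; only navigator.gpu, requestAdapter() and requestDevice() come from the WebGPU API itself.

```typescript
// Minimal structural types so this sketch compiles without the
// @webgpu/types package; a real project would use those typings.
interface AdapterLike {
  requestDevice(): Promise<object>;
}
interface GpuLike {
  requestAdapter(): Promise<AdapterLike | null>;
}

// True when a navigator-like object exposes the WebGPU entry point.
// (Hypothetical helper, not part of the WebGPU API.)
function hasWebGPU(nav: unknown): nav is { gpu: GpuLike } {
  if (typeof nav !== "object" || nav === null) return false;
  const gpu = (nav as { gpu?: unknown }).gpu;
  return (
    typeof gpu === "object" &&
    gpu !== null &&
    typeof (gpu as { requestAdapter?: unknown }).requestAdapter === "function"
  );
}

// Resolves to a GPU device, or to null when WebGPU cannot be used here.
async function initWebGPU(): Promise<object | null> {
  const nav = (globalThis as unknown as { navigator?: unknown }).navigator;
  if (!hasWebGPU(nav)) return null; // API not exposed by this browser
  const adapter = await nav.gpu.requestAdapter();
  if (adapter === null) return null; // API exposed, but no usable GPU
  return adapter.requestDevice();
}
```

A caller can fall back to WebGL, or to a CPU path, whenever initWebGPU() resolves to null.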
ML frameworks such as TensorFlow.js and Transformers.js can use WebGPU under the hood. With WebGL, a framework that wants to run a computation on the GPU has to encode the input tensor as texture or vertex data and then decode the result back into tensor data. WebGPU supports compute shaders, which are general-purpose programs that run on the GPU. When TensorFlow.js can use compute shaders directly, there is no need to convert the input data into vertices or textures; the computation runs on the tensor data itself. This lower overhead has made it practical to run ML models such as SAM 2, Whisper, Llama, and Gemma on the client side.
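To make the compute-shader idea concrete, here is a small sketch that doubles every number in an array on the GPU. It is an illustration under stated assumptions, not how TensorFlow.js or Transformers.js is implemented: the names shaderCode, workgroupCount and doubleOnGpu are made up for this example, and `any` stands in for the typings a real project would take from the @webgpu/types package. Note that the numbers are written into a storage buffer as they are, with no detour through textures or vertices.

```typescript
// WGSL compute shader: each invocation doubles one array element.
const shaderCode = `
@group(0) @binding(0) var<storage, read_write> data: array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
  if (id.x < arrayLength(&data)) {
    data[id.x] = data[id.x] * 2.0;
  }
}`;

const WORKGROUP_SIZE = 64; // must match @workgroup_size in the shader

// Number of workgroups needed to cover `n` elements.
function workgroupCount(n: number, size: number = WORKGROUP_SIZE): number {
  return Math.ceil(n / size);
}

// Doubles every element on the GPU. Resolves to null when WebGPU is not
// available in the current environment.
async function doubleOnGpu(input: Float32Array): Promise<Float32Array | null> {
  if (input.length === 0) return new Float32Array(0);
  const g = globalThis as any; // stand-in for @webgpu/types
  const adapter = await g.navigator?.gpu?.requestAdapter();
  if (!adapter) return null;
  const device = await adapter.requestDevice();

  const pipeline = device.createComputePipeline({
    layout: "auto",
    compute: {
      module: device.createShaderModule({ code: shaderCode }),
      entryPoint: "main",
    },
  });

  // Storage buffer the shader reads and writes; the numbers go in as-is.
  const storage = device.createBuffer({
    size: input.byteLength,
    usage:
      g.GPUBufferUsage.STORAGE |
      g.GPUBufferUsage.COPY_SRC |
      g.GPUBufferUsage.COPY_DST,
  });
  device.queue.writeBuffer(storage, 0, input);

  // Separate buffer that the CPU is allowed to map for reading.
  const readback = device.createBuffer({
    size: input.byteLength,
    usage: g.GPUBufferUsage.MAP_READ | g.GPUBufferUsage.COPY_DST,
  });

  const bindGroup = device.createBindGroup({
    layout: pipeline.getBindGroupLayout(0),
    entries: [{ binding: 0, resource: { buffer: storage } }],
  });

  // Record the compute pass and the copy into the readback buffer.
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  pass.setPipeline(pipeline);
  pass.setBindGroup(0, bindGroup);
  pass.dispatchWorkgroups(workgroupCount(input.length));
  pass.end();
  encoder.copyBufferToBuffer(storage, 0, readback, 0, input.byteLength);
  device.queue.submit([encoder.finish()]);

  // Wait for the GPU, then copy the result out before unmapping.
  await readback.mapAsync(g.GPUMapMode.READ);
  const result = new Float32Array(readback.getMappedRange().slice(0));
  readback.unmap();
  return result;
}
```

In a browser with WebGPU enabled, calling doubleOnGpu(new Float32Array([1, 2, 3])) should resolve to an array holding 2, 4 and 6; elsewhere it resolves to null. A single dispatch like this is only suitable for modest array sizes, since the number of workgroups per dimension is limited.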