Generative Drinker: An Idea for Improving Wine Compatibility

About 17% of Windows users are locked into proprietary file formats, which makes it impossible for them to switch to Linux unless, of course, Wine becomes perfectly compatible with the proprietary software that they need.

In this blog post, I would like to outline my idea for how to make Office, Photoshop, and AutoCAD/Fusion work flawlessly in Wine. As usual, it starts with poking around in the dark (i.e. a well-designed test suite) and a catchy name. I present to you:

The Generative Drinker

When training image generation AIs, a GAN stage is a crucial step for improving realism and suppressing artifacts. GAN here stands for “Generative Adversarial Network”. Most drinkers also have a somewhat adversarial relationship with their booze, and the Wine project is called, well, Wine. That’s why I think the process of an adversarial AI fighting against Wine is nicely described as a “Generative Drinker”.

How does GAN work?

GAN training involves two components:

The productive AI model tries to generate images that look as real as possible.
The adversarial AI model tries to accurately predict which images are real photos and which are AI-generated.

In the beginning, when the productive AI is drawing 6-fingered hands or including obvious artifacts, the adversarial AI model quickly learns to spot these differences between real photos and gen-AI images. The decision function of the adversarial AI model is then mathematically inverted and used as a loss that forces the productive AI model to avoid any patch of pixels that the adversarial AI model would flag as likely fake.

So the adversarial AI forces the productive AI to become better. Once it is, the adversarial AI is then updated to find whatever mistakes the productive AI is now making. And, thus, the cycle repeats.
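To make the "inverted" loss a bit more concrete, here is a minimal sketch of the commonly used binary cross-entropy GAN losses. The function names are mine, and `p_real` / `p_fake` stand for the adversarial model's estimated probability that a real photo or a generated image, respectively, is real. Actual training code computes these over batches inside an autodiff framework; this only shows the direction in which each model gets pushed.

```python
import math


def discriminator_loss(p_real: float, p_fake: float) -> float:
    """Low when real photos score near 1 and generated images near 0."""
    return -(math.log(p_real) + math.log(1.0 - p_fake))


def generator_loss(p_fake: float) -> float:
    """Low when the adversarial model is fooled, i.e. p_fake is near 1."""
    return -math.log(p_fake)


if __name__ == "__main__":
    # Early training: the 6-fingered hands are easy to spot.
    print(generator_loss(0.05))
    # Later: the adversarial model is fooled most of the time.
    print(generator_loss(0.9))
```

The two losses pull in opposite directions on `p_fake`, which is the whole adversarial trick.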

How does the Generative Drinker work?

Applied to software compatibility, the productive AI in the GAN model is the Wine software suite, and the adversarial AI model is akin to the compatibility test suite.

In short, I am suggesting the use of random, AI-like exploration to automatically generate little Windows test programs that pass on Windows but fail on current Wine versions.

A Detour to iTunes

I like using iTunes to organize my music. I found it annoying that I could not listen to my iTunes library directly on Linux. So I fixed it.

Most Wine compatibility issues nowadays are due to Windows apps relying on undocumented behavior. Wine has become very good at correctly adhering to the public documentation of almost all Windows APIs.

In the case of iTunes, they use BindDC(DCRenderTarget, hdc) to create an ID2D1DCRenderTarget for drawing into an HDC. They then draw into the HDC and expect those changes to magically appear inside the DXGI target buffer of the DC render target. On real Windows, this works out OK because there is an undocumented behavior that, apparently, causes all HDCs to be created with a hardware-backed DXGI buffer, so that when the user calls BindDC, it just returns the internal (shared) ID2D1DCRenderTarget.

In short, Windows re-uses objects under the hood, and iTunes was relying on that undocumented implementation detail.

In case you’re curious, here’s the documentation I wrote when submitting my patch to Wine.

And back to Detours

Detours is a Windows library for hooking and redirecting DLL calls. I’ve used it before for sending the GUI draw commands of Japanese video games into a language server which then allowed me to replace/rewrite/translate text on the fly and inject the English result back into the GUI’s DirectWrite commands.

But here, we are searching for what smart people call behavioral invariants. In short, we want to find cases where the Windows API produces unexpectedly consistent return values, for example when objects are shared under the hood or when return values are cached.

In the iTunes case, I just made a lucky guess. But using Detours to log the results of HDC-related function calls while iTunes is running on real Windows, one could also have spotted the same undocumented behavior. And if we can execute arbitrary code inside the process space of a running Windows app, then we can also do a targeted verification of behavioral invariants, for example by simply calling BindDC twice where the app calls it just once and observing the identical return pointer, thereby all but proving that it is re-used internally.
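Here is a rough sketch of what "spotting" such an invariant in a call log could look like. Everything about the log is an assumption: Detours does not produce a log by itself, so I am pretending that my hooks wrote out tuples of (function, args, return value), and the handle values are made up. The function simply reports calls that were repeated with the same arguments and always returned the same value. That is a hint that something is cached or shared, not a proof.

```python
from collections import defaultdict


def find_stable_returns(log, min_calls=2):
    """Return {(function, args): value} for calls that were repeated at
    least min_calls times and always produced the same return value."""
    seen = defaultdict(list)
    for function, args, result in log:
        seen[(function, args)].append(result)
    return {
        key: results[0]
        for key, results in seen.items()
        if len(results) >= min_calls and len(set(results)) == 1
    }


# Made-up log entries in a hypothetical format: (function, args, return value).
SAMPLE_LOG = [
    ("GetDC", (0x1A2B,), 0x5001),
    ("CreateCompatibleDC", (0x5001,), 0x6001),
    ("GetDC", (0x1A2B,), 0x5001),
    ("CreateCompatibleDC", (0x5001,), 0x6002),
    ("GetDC", (0x1A2B,), 0x5001),
]

if __name__ == "__main__":
    for (function, args), value in find_stable_returns(SAMPLE_LOG).items():
        print(f"{function}{args} always returned {value:#x}")
```

Each hit would then become a hypothesis to verify with a targeted test, since a stable return value can also be perfectly well documented behavior.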

In Vino Veritas

Now let’s tie all of this together:

Using Detours, we can log how Windows APIs behave inside problematic proprietary software while it is running on real Windows.
Using AI, we can analyze those logs and create an unwieldy long list of undocumented behavior hypotheses.
We then spawn a fleet of AI agents to turn each hypothesis into C source code for a regular Windows app that tries to tell the difference between real Windows and Wine by checking for specific undocumented API side-effects.
We run the test apps under Windows and under Wine and compare the results.

=> We get a list of undocumented behavior that a specific proprietary software relies on, but that is not emulated in Wine (yet).
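The final comparison step is the easy part. As a minimal sketch, assume each run produces a mapping from test name to pass/fail; the test names and outcomes below are invented for illustration. Tests that fail on real Windows are dropped, because the hypothesis behind them was simply wrong.

```python
def wine_gaps(windows_results, wine_results):
    """List tests that pass on Windows but do not pass under Wine.

    A test that is missing from the Wine results counts as not passing.
    Tests that fail on Windows are ignored: their hypothesis was wrong,
    so they say nothing about Wine.
    """
    return sorted(
        name
        for name, passed in windows_results.items()
        if passed and not wine_results.get(name, False)
    )


# Hypothetical outcomes; the test names are invented for illustration.
WINDOWS = {"getdc_handle_reuse": True, "com_singleton": True, "msg_order": False}
WINE = {"getdc_handle_reuse": False, "com_singleton": True, "msg_order": False}

if __name__ == "__main__":
    print(wine_gaps(WINDOWS, WINE))
```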

Practical Applications

Let’s see the Generative Drinker defeat the big three Linux-slayers:

1. Microsoft Office

Office is a COM monster. Office secretly depends on certain COM objects being implemented as singletons, so that repeated calls to CoCreateInstance return the exact same pointer.

2. Photoshop

Photoshop’s rendering pipeline is a maze of GDI, Direct2D, and DirectWrite calls. If Photoshop shares device contexts internally the way iTunes does, the Generative Drinker would find it by logging CreateCompatibleDC and BindDC calls and noticing suspiciously consistent return values.

3. AutoCAD and Fusion 360

These obviously rely on undocumented behavior because how well they work varies widely between Wine versions.

Property window mouse clicks get ignored when using Wine’s comctl32, but work fine if you copy over native Windows DLLs. The current working theory is that it’s a synchronization error, like WM_LBUTTONDOWN messages arriving in a slightly different order inside Wine. That’s exactly the sort of subtle difference that the Generative Drinker is designed to find.
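If the message-order theory holds, the difference should show up when the same click is traced on both systems. A minimal sketch, assuming the hooks have already reduced each trace to a plain list of message names (the traces below are invented, not captured from AutoCAD or Fusion 360):

```python
from itertools import zip_longest


def first_divergence(windows_msgs, wine_msgs):
    """Return (index, windows_msg, wine_msg) for the first position where
    the two message sequences differ, or None if they are identical.
    A missing message at the end of the shorter trace shows up as None."""
    pairs = zip_longest(windows_msgs, wine_msgs)
    for index, (win, wine) in enumerate(pairs):
        if win != wine:
            return index, win, wine
    return None


# Invented traces for illustration only; not captured from a real application.
WINDOWS_TRACE = ["WM_MOUSEMOVE", "WM_LBUTTONDOWN", "WM_SETFOCUS", "WM_LBUTTONUP"]
WINE_TRACE = ["WM_MOUSEMOVE", "WM_SETFOCUS", "WM_LBUTTONDOWN", "WM_LBUTTONUP"]

if __name__ == "__main__":
    print(first_divergence(WINDOWS_TRACE, WINE_TRACE))
```

Real traces would need some normalization first (timer and paint messages are noisy), so treat the exact-match comparison as a starting point.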

Summary

Wine implements the API correctly according to documentation. But the applications rely on undocumented behavior. Which, purely by coincidence, the Wine implementation does not share.

The Generative Drinker turns this from a guessing game into a systematic search. Even before anyone complains, we proactively hypothesize “Photoshop assumes GetDC returns cached handles” (for example) and add it to the test suite.

I hope I didn’t go overboard with the Wine and drinking jokes. This is meant to be a real suggestion on how to improve the Wine test suite. I hope that one day I’ll have some time to turn this idea into reality, because I’d really like to have Photoshop, Lightroom Classic, and Fusion 360 running on my Pop!_OS workstation. Those were already some of the last remaining wishes on my list 2 years ago.

I tried to see if anyone has done something similar. The DiffSpec paper describes a similar idea, but without AI, without automatic ground truth log generation, and without providing any code. And while Mokav has code on GitHub, they require the user to manually execute tests, so they are lacking the Generative Drinker’s hooking and cross-platform (Windows and Wine) test execution capabilities, which means Mokav cannot run unattended.

If you have any good idea on how to turn the Adobe suite into a first-class Linux citizen, or if you know of a similar project that I missed, please comment on HN.