
A new release of FEX-Emu is now available, featuring an official WINE WoW64 and Arm64ec package, partial support for inline self-modifying code and the trap flag, and JIT bug fixes and performance improvements. The package provides emulator DLL files that enable x86 and x86-64 emulation directly inside an AArch64 build of WINE, removing a large amount of CPU overhead. Inline self-modifying code and the trap flag are used by certain anti-tamper and anti-debugger software.

The JIT changes include fixes for incorrect backpatching of unaligned atomics and incorrect instruction handling, along with performance improvements for a couple of instructions. They also correct float to integer overflow behaviour, ModRM decoding of 3DNow! instructions, and H0F3A table decoding.



FEX-2501

Welcome back to another new FEX-Emu release in the new year! While everyone was out celebrating the holidays, we still managed to get some work done. So let's get into what we did this last month!

Official WINE WoW64 and Arm64ec package support

This month we have updated our Ubuntu PPA repository to now support a fex-emu-wine package. This package provides wow64 and arm64ec emulator DLL files that can be applied directly to an AArch64 build of WINE, allowing you to do x86 and x86-64 emulation inside of WINE directly and removing a ton of CPU overhead in the emulation! This is relatively fresh, so there will be some teething issues around getting it set up; for example, current upstream WINE may not integrate directly into these builds yet. Check out our wiki for more information about getting this hooked up.

Partial support for inline self-modifying code and trap flag

As we work towards supporting more edge-case behaviour of anti-tamper and anti-debugger software, we have spent some time this month implementing support for inline self-modifying code and the trap flag. In particular, Denuvo uses inline self-modifying code, which is relatively annoying to support, but we can use the fact that it tends to generate invalid instructions to determine early that a block of code is invalid, thus letting it work. There's more work to do to make this robust, but this gets a decent number of games running.

The trap flag, on the other hand, is interesting because it is an anti-debugger tactic that some badly behaving launchers use. Debuggers treat the trap flag differently from how it behaves when no debugger is running, which lets the application detect the debugger and throw an error. FEX didn't quite handle this correctly, which was causing these launchers to throw their hands up and stop running.

Note that some of this work is only wired up on the WINE side rather than the FEX-Emu Linux emulation side, so mileage may vary!

JIT bug fixes and performance improvements

As usual, a lot of fixes landed for our JIT, ranging from incorrect backpatching of unaligned atomics, to incorrect instruction handling, to improving performance of a couple of instructions. Let's break down what we fixed this month.

Fixed backpatching of unaligned atomics with small immediates

ARM's FEAT_LRCPC2 extension added TSO instructions that take small immediate offsets in the range of -256 to 255. These still have the regular ARM atomic limitation that the address needs to be naturally aligned for the access type (or the access needs to stay within a 16-byte granule!). FEX needs to emulate unaligned memory accesses from x86 by backpatching these instructions to be a DMB plus a load or store. We were incorrectly backpatching the variants of these instructions that use small offsets. This fix will improve stability of emulation on hardware that supports the new FEAT_LRCPC2 instructions.
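To make the two conditions above concrete, here is a minimal sketch of the checks involved: whether an offset fits the small immediate range, and whether an access is safe for the atomic instruction or would need the DMB plus load/store fallback. The function names and the `granule_allowed` parameter are made up for illustration and are not FEX's actual code.

```python
# Illustrative only: these helpers are hypothetical, not part of FEX.

def fits_small_immediate(offset):
    """The small immediate form covers offsets from -256 to 255 inclusive."""
    return -256 <= offset <= 255


def is_naturally_aligned(address, size):
    """True when the address is a multiple of the access size in bytes."""
    return address % size == 0


def within_16_byte_granule(address, size):
    """True when the first and last byte of the access share a 16-byte granule."""
    return address // 16 == (address + size - 1) // 16


def needs_backpatch(address, size, granule_allowed):
    """Decide whether an access must fall back to DMB plus a plain load/store.

    granule_allowed is an assumed switch for hardware that tolerates
    unaligned atomics as long as they stay inside a 16-byte granule.
    """
    if is_naturally_aligned(address, size):
        return False
    if granule_allowed and within_16_byte_granule(address, size):
        return False
    return True
```

For example, an 8-byte access at 0x1004 stays inside one granule, while the same access at 0x100C crosses into the next one and would need the fallback either way.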

Fix float to integer overflow behaviour

This is a very important change to how FEX handles converting a float value to an integer when an overflow occurs. While we knew of the problem, we didn't realize how wide-reaching it was. In particular, this fixes The Talos Principle's audio cutting out, Animal Well's music having chirping artifacts, SOMA not allowing interactions with things in the world, Satisfactory's server crashing, and Metaphor: ReFantazio infinite looping before getting in-game!

This is sure to fix a bunch of other little issues as well, because it's pervasive behaviour that games rely upon!
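As background on why this kind of mismatch shows up, here is a small sketch comparing the two conversion behaviours for a 32-bit result. x86 conversions such as CVTTSS2SI return the "integer indefinite" value 0x80000000 on overflow or NaN, while AArch64's FCVTZS saturates and turns NaN into zero. The function names are made up for illustration, and this models the architectural behaviour only, not the FEX change itself.

```python
import math

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1
# x86 "integer indefinite" for 32-bit results: bit pattern 0x80000000.
INT32_INDEFINITE = INT32_MIN


def x86_style_convert(value):
    """Truncate toward zero; NaN or out-of-range gives integer indefinite."""
    if math.isnan(value) or math.isinf(value):
        return INT32_INDEFINITE
    truncated = math.trunc(value)
    if truncated < INT32_MIN or truncated > INT32_MAX:
        return INT32_INDEFINITE
    return truncated


def arm_style_convert(value):
    """Truncate toward zero; out-of-range saturates and NaN becomes 0."""
    if math.isnan(value):
        return 0
    if math.isinf(value):
        return INT32_MAX if value > 0 else INT32_MIN
    truncated = math.trunc(value)
    return max(INT32_MIN, min(INT32_MAX, truncated))
```

The two agree for in-range inputs and for negative overflow, but differ for positive overflow and NaN, which is exactly the sort of edge case a game can end up depending on without knowing it.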

Fix ModRM decoding of 3DNow! instructions

While 3DNow! isn't used in any recent games, to the point that AMD has removed the instruction set from Zen CPU cores, older games still use this extension if possible. It turns out we had a gap in our testing infrastructure for when a 3DNow! instruction used the SIB encoding form. This would result in crashes and misinterpreted instructions. This will fix some older 32-bit games using 3DNow!, and of course we added new unit tests to our testing infrastructure to make sure it keeps working.
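For a sense of why the SIB form matters here: 3DNow! instructions are encoded as 0F 0F, then the ModRM operand bytes, then a trailing byte that selects the actual operation. A decoder has to size the ModRM, SIB, and displacement bytes correctly before it can find that trailing byte. The sketch below does this for 32-bit addressing only (no address-size prefix); the function names are made up for illustration and this is not FEX's decoder.

```python
# Illustrative only: 32-bit addressing, no prefixes, hypothetical names.

def modrm_operand_length(code):
    """Return how many bytes ModRM + optional SIB + displacement occupy.

    code must start at the ModRM byte.
    """
    modrm = code[0]
    mod = modrm >> 6
    rm = modrm & 0b111
    length = 1
    if mod == 0b11:
        return length  # register operand, nothing follows ModRM
    if rm == 0b100:
        sib = code[1]
        length += 1  # SIB byte present
        if mod == 0b00 and (sib & 0b111) == 0b101:
            length += 4  # no base register, disp32 follows
    elif mod == 0b00 and rm == 0b101:
        length += 4  # disp32 with no base register
    if mod == 0b01:
        length += 1  # disp8
    elif mod == 0b10:
        length += 4  # disp32
    return length


def threednow_opcode(instruction):
    """Return the trailing opcode byte of a 0F 0F encoded instruction."""
    if bytes(instruction[:2]) != b"\x0f\x0f":
        raise ValueError("not a 0F 0F encoded instruction")
    return instruction[2 + modrm_operand_length(instruction[2:])]
```

With the register form `0F 0F C1 9E` and the SIB form `0F 0F 04 88 9E`, both calls find the same trailing byte 0x9E (PFADD). Forgetting to count the SIB byte would instead read 0x88 as the operation.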

Fixes H0F3A table decoding

This fix doesn't affect any known applications, but because of how x86 compilers aggressively pad instruction sizes, this could crop up anywhere without us noticing. When the H0F3A instruction table gets decoded, FEX was incorrectly applying the REX_W prefix to instructions that would ignore the prefix. Out of all the instructions in the table, only three actually care about the prefix while the others always ignore it. If this padding occurred, FEX would think it was an unknown instruction and crash. This has now been resolved, which should keep us from ever hitting the issue.

Generate 80-bit SVE loadstores when necessary

For all the users that have SVE supporting hardware (there aren't a lot of you!), we have added a new optimization that converts two loads or stores into a single 80-bit masked loadstore instruction. While this isn't going to be a huge improvement because this only occurs with x87 code, it's another little optimization in the list of things that SVE improves for x86 emulation.

Increase minimum kernel requirement from 5.0 to 5.15

We're moving into the future with some changes that require increasing our minimum kernel version. Because we were allowing such an old version of the Linux kernel, we were hitting some heartburn in some codepaths. In order to make this easier, we are moving up the minimum kernel requirement to an LTS release of the kernel released back in 2021 already! We don't expect this to cause too many problems, since this is a kernel supported by Ubuntu for 22.04.
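If you want to check whether a machine meets the new requirement, a version comparison along these lines would do it. This is a rough sketch with made-up function names, not how FEXLoader performs its own check.

```python
import re

# Minimum (major, minor) kernel version stated in this release.
MINIMUM_KERNEL = (5, 15)


def parse_kernel_release(release):
    """Extract (major, minor) from a release string like '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def kernel_is_supported(release, minimum=MINIMUM_KERNEL):
    """True when the release is at or above the minimum version."""
    return parse_kernel_release(release) >= minimum
```

On a running system the release string can be taken from `platform.release()` in Python's standard library, for example `kernel_is_supported(platform.release())`.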

Drop official support for ArchLinux

Due to a clarification from the ArchLinux team this last month, they are no longer allowing packages in the AUR that don't support x86-64. Due to this change and that FEX only supports running on an AArch64 host, they have removed our official packages from AUR. There's nothing that we can do about this besides dropping support for ArchLinux.

Raw Changes

FEX Release FEX-2501

  • ArchHelpers

  • Arm64

  • Fixes LDAPUR and STLUR backpatching (1e827ec)

  • ConstProp

  • fix 32-bit masking behaviour (c902b88)

  • Context

  • Constify GPRs passed to ReconstructCompactedEFLAGS (a86c922)

  • External

  • Update bundled libfmt (7e257cc)

  • FEXCore

  • Emulate EFLAGS.TF (e88c92d)

  • Override x87 precision control when necessary (8111b7c)

  • Don't WaitForEmptyJobQueue if CodeObjectCacheService isn't used (5a4691f)

  • FEXLoader

  • Increase minimum kernel requirement from 5.0 to 5.15 (6bc7a83)

  • Enable early logs output to stderr (e32c538)

  • Frontend

  • Fix ModRM handling with 3DNow! (15a1a0f)

  • GdbServer

  • Fixes encoding of hex (735a4f9)

  • Support 32-bit context definitions (072cf4c)

  • Implement support for $vKill (46fb858)

  • IR

  • Change convention from number of elements to elementsize (a6c67ca)

  • Passes

  • Adds missing comment that clang-format keeps complaining about locally (b03b02d)

  • InstCountCI

  • Adds more LRCPC2 tests that are missed (cd6722f)

  • Implement support for TSO and LRCPC and add hot block that could be optimized (9fb69ed)

  • InstructionCountCI

  • add some hot blocks from Factorio (e44d1f1)

  • Linux

  • Fixes typo in removing RESOLVE_IN_ROOT flag (e55b5d0)

  • FaultSafeUserMemAccess

  • Break out fault safe handler (57178ab)

  • LinuxEmulation

  • Don't use clone3 for fork (71187d3)

  • LinuxSyscalls

  • Log unhandled clone3 fork flags (c3261b4)

  • Ensure CSIGNAL is merged back in to flags for clone2 (c7fb95a)

  • Fixes exit syscall (bdae4f6)

  • OpcodeDispatcher

  • Fixes FEX's H0F3A table handling of REX.W (90b1ac4)

  • Minor division improvement (04e785e)

  • ThreadManager

  • Add some sanity asserts (d8ef702)

  • Threads

  • Fix memory leak in joinable() (f906c6a)

  • Thunks

  • gen

  • Add support for compiling against clang 19 (7b2fc37)

  • Utils

  • FileLoading

  • Fix LoadFileImpl (527752c)

  • Windows

  • Only deinit the thread CRT when destroying the current thread (d2bac45)

  • Track RWX regions in mapped images (27ededf)

  • Misc

  • Just a few things picked up from static analysis (8913c59)

  • Support a merged RootFS (and a bunch of related fixes) (2d66bc2)

  • Fix float->int conversion overflow behaviour (d2f86e4)

  • Library Forwarding: Allow reading standard library headers from a development x86 rootfs (d66cd16)

  • Support inline self modifying code (656477e)

  • Revert #4118 (7472b21)

  • Generate SVE for 80bit load/stores when possible (8427731)

  • docs

  • Remove Arch from the release process. (f8b6edf)

  • unittests

  • Adds a 3DNow! ModRM SIB encoding test (3abe6c1)

  • ASM

  • Fix incorrect instruction form test (b391fe6)

  • Adds missing MMX PADDQ test (fc1b500)

  • gvisor

  • Disable memfd tests (8bee101)

  • x87StackOptimizationPass

  • Minor opt to f80 fchs and fabs (f51812a)

Release FEX-2501 · FEX-Emu/FEX