CVE-2022-49394

In the Linux kernel, the following vulnerability has been resolved:

blk-iolatency: Fix inflight count imbalances and IO hangs on offline

iolatency needs to track the number of inflight IOs per cgroup. As this tracking can be expensive, it is disabled when no cgroup has iolatency configured for the device. To ensure that the inflight counters stay balanced, iolatency_set_limit() freezes the request_queue while manipulating the enabled counter, which ensures that no IO is in flight and thus all counters are zero.

Unfortunately, iolatency_set_limit() isn't the only place where the enabled counter is manipulated. iolatency_pd_offline() can also decrement the counter and trigger disabling. As this disabling happens without freezing the queue, it can easily happen while some IOs are in flight and thus leak the counts.

This can be easily demonstrated by turning on iolatency in one empty cgroup while IOs are in flight in other cgroups and then removing the cgroup. Note that iolatency shouldn't have been enabled elsewhere in the system, to ensure that removing the cgroup disables iolatency for the whole device.

The following keeps flipping iolatency on and off on sda:

    echo +io > /sys/fs/cgroup/cgroup.subtree_control
    while true; do
        mkdir -p /sys/fs/cgroup/test
        echo '8:0 target=100000' > /sys/fs/cgroup/test/io.latency
        sleep 1
        rmdir /sys/fs/cgroup/test
        sleep 1
    done

and there's concurrent fio generating direct random reads:

    fio --name test --filename=/dev/sda --direct=1 --rw=randread \
        --runtime=600 --time_based --iodepth=256 --numjobs=4 --bs=4k

while monitoring with the following drgn script:

    while True:
        for css in css_for_each_descendant_pre(prog['blkcg_root'].css.address_of_()):
            for pos in hlist_for_each(container_of(css, 'struct blkcg', 'css').blkg_list):
                blkg = container_of(pos, 'struct blkcg_gq', 'blkcg_node')
                pd = blkg.pd[prog['blkcg_policy_iolatency'].plid]
                if pd.value_() == 0:
                    continue
                iolat = container_of(pd, 'struct iolatency_grp', 'pd')
                inflight = iolat.rq_wait.inflight.counter.value_()
                if inflight:
                    print(f'inflight={inflight} {disk_name(blkg.q.disk).decode("utf-8")} '
                          f'{cgroup_path(css.cgroup).decode("utf-8")}')
        time.sleep(1)

The monitoring output looks like the following:

    inflight=1 sda /user.slice
    inflight=1 sda /user.slice
    ...
    inflight=14 sda /user.slice
    inflight=13 sda /user.slice
    inflight=17 sda /user.slice
    inflight=15 sda /user.slice
    inflight=18 sda /user.slice
    inflight=17 sda /user.slice
    inflight=20 sda /user.slice
    inflight=19 sda /user.slice  <- fio stopped, inflight stuck at 19
    inflight=19 sda /user.slice
    inflight=19 sda /user.slice

If a cgroup with stuck inflight ends up getting throttled, the throttled IOs will never get issued, as there's no completion event to wake it up, leading to an indefinite hang.

This patch fixes the bug by unifying enable handling into a work item which is automatically kicked off from iolatency_set_min_lat_nsec(), which is called from both the iolatency_set_limit() and iolatency_pd_offline() paths. Punting to a work item is necessary as iolatency_pd_offline() is called under spinlocks while freezing a request_queue requires a sleepable context. This also simplifies the code, reducing LOC sans the comments, and avoids the unnecessary freezes which were happening whenever a cgroup's latency target was newly set or cleared.
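The punt-to-work-item pattern the fix describes is worth seeing concretely. The sketch below is a minimal illustration modeled on the commit message, not the upstream patch itself: the struct and function names (iolat_dev, iolat_enable_work_fn, iolat_update_enabled) are hypothetical stand-ins, and only blk_mq_freeze_queue()/blk_mq_unfreeze_queue(), schedule_work(), container_of(), and the atomic helpers are real kernel APIs.

    /*
     * Minimal sketch of the work-item pattern described above.
     * Names are illustrative stand-ins, not the exact upstream patch.
     */
    #include <linux/atomic.h>
    #include <linux/blk-mq.h>
    #include <linux/workqueue.h>

    struct iolat_dev {
        struct request_queue *q;
        atomic_t enable_cnt;        /* # of cgroups with a latency target */
        bool enabled;               /* whether inflight tracking is active */
        struct work_struct enable_work;
    };

    /* Runs in process (sleepable) context, so freezing the queue is safe. */
    static void iolat_enable_work_fn(struct work_struct *work)
    {
        struct iolat_dev *iolat =
            container_of(work, struct iolat_dev, enable_work);
        bool enable = atomic_read(&iolat->enable_cnt);

        if (enable != iolat->enabled) {
            /*
             * While the queue is frozen no IO is in flight, so every
             * per-cgroup inflight counter is zero when tracking is
             * switched on or off - the invariant the bug violated.
             */
            blk_mq_freeze_queue(iolat->q);
            iolat->enabled = enable;
            blk_mq_unfreeze_queue(iolat->q);
        }
    }

    /*
     * Called from both the set-limit and pd-offline paths. The latter
     * runs under spinlocks, so instead of freezing the queue here we
     * only adjust the counter and schedule the work item.
     */
    static void iolat_update_enabled(struct iolat_dev *iolat, bool set)
    {
        if (set)
            atomic_inc(&iolat->enable_cnt);
        else
            atomic_dec(&iolat->enable_cnt);
        schedule_work(&iolat->enable_work);
    }

Because both call sites funnel through the same helper, the enabled state can only flip under a queue freeze, while the offline path performs just an atomic decrement and a schedule_work(), both of which are legal in atomic context.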
Configurations

Configuration 1

OR cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:5.0:-:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:5.0:rc6:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:5.0:rc7:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:5.0:rc8:*:*:*:*:*:*

History

21 Oct 2025, 12:15

Type    Values Removed    Values Added
CVSS    v2 : unknown      v2 : unknown
        v3 : unknown      v3 : 5.5
Summary  • (es) Spanish translation of the English description above.
First Time    Linux
              Linux linux Kernel
CWE NVD-CWE-noinfo
CPE cpe:2.3:o:linux:linux_kernel:5.0:-:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:5.0:rc7:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:5.0:rc6:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:5.0:rc8:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*
References    https://git.kernel.org/stable/c/515d077ee3085ae343b6bea7fd031f9906645f38 - Patch
References    https://git.kernel.org/stable/c/5b0ff3ebbef791341695b718f8d2870869cf1d01 - Patch
References    https://git.kernel.org/stable/c/77692c02e1517c54f2fd0535f41aa4286ac9f140 - Patch
References    https://git.kernel.org/stable/c/8a177a36da6c54c98b8685d4f914cb3637d53c0d - Patch
References    https://git.kernel.org/stable/c/968f7a239c590454ffba79c126fbe0e963a0ba78 - Patch
References    https://git.kernel.org/stable/c/a30acbb5dfb7bcc813ad6a18ca31011ac44e5547 - Patch
References    https://git.kernel.org/stable/c/d19fa8f252000d141f9199ca32959c50314e1f05 - Patch

26 Feb 2025, 07:01

Type Values Removed Values Added
New CVE

Information

Published : 2025-02-26 07:01

Updated : 2025-10-21 12:15


NVD link : CVE-2022-49394

Mitre link : CVE-2022-49394

CVE.ORG link : CVE-2022-49394



Products Affected

linux

  • linux_kernel