A race condition between the af_unix garbage collector and connect() has been identified and fixed in the Linux kernel. The garbage collector did not account for an embryo (the not-yet-accepted socket that connect() queues on a listening socket) being enqueued while garbage collection is in progress, which could lead to issues such as dangling pointers and an incorrectly elevated inflight count.

This post analyzes the issue, walks through the interleaving in which the race occurs, describes how the inflight count goes wrong and how the fix works, and lists the original references for further reading.

The timeline below shows the problematic interleaving. Each column is one thread of execution, and time runs from top to bottom:

connect(S, addr)	sendmsg(S, [V]); close(V)	__unix_gc()
----------------	-------------------------	-----------
NS = unix_create1()
skb1 = sock_wmalloc(NS)
L = unix_find_other(addr)
unix_state_lock(L)
unix_peer(S) = NS
			// V count=1 inflight=0

 			NS = unix_peer(S)
 			skb2 = sock_alloc()
			skb_queue_tail(NS, skb2[V])

			// V became in-flight
			// V count=2 inflight=1

			close(V)

			// V count=1 inflight=1
			// GC candidate condition met

						for u in gc_inflight_list:
						  if (total_refs == inflight_refs)
						    add u to gc_candidates

						// gc_candidates={L, V}

						for u in gc_candidates:
						  scan_children(u, dec_inflight)

						// embryo (skb1) was not
						// reachable from L yet, so V's
						// inflight remains unchanged
__skb_queue_tail(L, skb1)
unix_state_unlock(L)
						for u in gc_candidates:
						  if (u.inflight)
						    scan_children(u, inc_inflight_move_tail)

						// V count=1 inflight=2 (!)

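The diagram illustrates the following steps:

1. An unconnected AF_UNIX/SOCK_STREAM socket (S) calls connect() on the address of a listening socket (L). connect() allocates the embryo socket NS and its skb (skb1), takes L's state lock, and sets unix_peer(S) = NS, but has not yet queued skb1 on L.

2. A second thread sends V's file descriptor over S with sendmsg() and then closes V. The message (skb2) is queued on NS, so V becomes in-flight (count=2, inflight=1). After close(V), V has count=1 and inflight=1, which meets the GC candidate condition.

3. The garbage collector (__unix_gc()) scans gc_inflight_list and selects every socket whose total reference count equals its inflight count. Here the candidates are L and V.

4. The first scan_children() pass (dec_inflight) walks the children of each candidate. Because skb1 is not yet on L's queue, the embryo is unreachable from L, and V's inflight count remains unchanged.

5. connect() then queues skb1 on L and releases the lock. The second scan_children() pass (inc_inflight_move_tail) now reaches V through the embryo and increments its inflight count, leaving V with count=1 and inflight=2. This incorrectly elevated inflight count is what leads to issues such as dangling pointers.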
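
The userspace side of this scenario (connecting to a listener that never calls accept(), passing V over the connection, then closing V) can be sketched with Python's standard socket module. This is only an illustration of how a descriptor ends up in-flight inside an embryo's queue. It does not race against the garbage collector and does not trigger the bug. The function name and the temporary socket path are hypothetical, and the sketch assumes a Linux host.

```python
import array
import os
import socket
import tempfile


def send_fd_to_unaccepted_listener():
    """Connect S to listener L, pass V over S, then close V.

    L never calls accept(), so the message carrying V's descriptor sits in
    the receive queue of the embryo socket that the kernel queued on L.
    Returns the number of data bytes sent.
    """
    tmpdir = tempfile.mkdtemp()
    addr = os.path.join(tmpdir, "listener.sock")  # hypothetical path

    L = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    L.bind(addr)
    L.listen(1)

    S = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    S.connect(addr)  # the kernel creates the embryo and queues it on L

    V = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    fds = array.array("i", [V.fileno()])
    # SCM_RIGHTS ancillary data carries V's descriptor, so V becomes in-flight.
    sent = S.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])
    V.close()  # the in-flight reference is now the only reference to V

    # Clean up; closing L also discards the never-accepted embryo.
    S.close()
    L.close()
    os.unlink(addr)
    os.rmdir(tmpdir)
    return sent
```

Calling send_fd_to_unaccepted_listener() performs the connect(), sendmsg(), and close(V) steps in order on a single thread. In the vulnerable interleaving, these steps run concurrently with each other and with __unix_gc().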

Exploit Detail

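The issue is a race between the garbage collector and connect(). The garbage collector walks the children of its candidates twice with scan_children(): once to subtract the in-flight references held by other candidates (dec_inflight), and once to restore them for candidates that still have remaining references (inc_inflight_move_tail). It does not consider that an embryo may be enqueued on a candidate listening socket during garbage collection. When that happens, the two consecutive passes of scan_children() see different sets of children: the second pass adds a reference that the first pass never subtracted, leading to the issues stated previously.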
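
The effect of the two passes seeing different children can be reproduced with a small, deterministic model. This is a simplified sketch and not kernel code: the Sock class, its fields, and run_gc() are hypothetical names, the candidate list is taken directly from the diagram ({L, V}), and L is assumed to start with count=1 and inflight=1 so that it qualifies as a candidate.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Sock:
    name: str
    count: int = 0           # total file references
    inflight: int = 0        # references held by queued SCM_RIGHTS messages
    listening: bool = False
    embryos: list = field(default_factory=list)  # embryos queued on a listener
    rx_fds: list = field(default_factory=list)   # sockets passed into this queue


def scan_children(x, func, candidates):
    # A listener's children are reached through its queued embryos.
    holders = x.embryos if x.listening else [x]
    for holder in holders:
        for child in holder.rx_fds:
            if child in candidates:
                func(child)


def dec_inflight(u):
    u.inflight -= 1


def inc_inflight(u):
    u.inflight += 1


def run_gc(embryo_arrives_between_passes):
    """Return (count, inflight) of V after both scan passes."""
    V = Sock("V", count=1, inflight=1)
    L = Sock("L", count=1, inflight=1, listening=True)  # assumed state
    NS = Sock("NS", rx_fds=[V])                         # embryo holding skb2[V]
    candidates = [L, V]

    if not embryo_arrives_between_passes:
        L.embryos.append(NS)  # connect() finished before the GC started

    for u in candidates:      # first pass
        scan_children(u, dec_inflight, candidates)

    if embryo_arrives_between_passes:
        L.embryos.append(NS)  # __skb_queue_tail(L, skb1) lands mid-GC

    for u in candidates:      # second pass
        if u.inflight:
            scan_children(u, inc_inflight, candidates)

    return V.count, V.inflight
```

With the embryo queued before the first pass, the decrement and the increment cancel out and run_gc(False) returns (1, 1). With the embryo queued between the passes, run_gc(True) returns (1, 2), matching the final line of the diagram.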

Fix Implementation

A fix for this vulnerability has been implemented in the Linux kernel, addressing the race condition between the garbage collector and connect(). When the garbage collector selects a listening socket as a GC candidate, it now locks and immediately unlocks that socket's state lock. As the diagram shows, connect() holds this lock (unix_state_lock(L)) from before it sets unix_peer(S) until after it has enqueued the embryo, so the garbage collector waits until any ongoing connect() to that socket has finished. Once the lock has been taken and released, any SCM-laden embryo is already enqueued, and both scan_children() passes see the same children.
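
The lock/unlock idea can be illustrated with ordinary threads. The following is a minimal userspace analogy, not the kernel patch: a threading.Lock stands in for the listening socket's state lock, a list stands in for its receive queue, and all names are hypothetical.

```python
import threading
import time


def gc_sees_embryo_after_lock_flip():
    """Return the listener's queue as observed by the GC after the lock flip."""
    state_lock = threading.Lock()   # stands in for the state lock of L
    recv_queue = []                 # stands in for L's receive queue
    in_critical_section = threading.Event()

    def connect():
        with state_lock:                  # unix_state_lock(L)
            in_critical_section.set()
            time.sleep(0.05)              # widen the race window
            recv_queue.append("skb1")     # __skb_queue_tail(L, skb1)
        # leaving the with-block corresponds to unix_state_unlock(L)

    t = threading.Thread(target=connect)
    t.start()
    in_critical_section.wait()      # connect() is now mid-flight

    # GC side: L was just selected as a candidate. Taking and releasing the
    # lock blocks until the in-progress connect() has enqueued its embryo.
    state_lock.acquire()
    state_lock.release()
    seen = list(recv_queue)

    t.join()
    return seen
```

Because the acquire can only succeed after connect() has left its critical section, the function always returns ["skb1"]: by the time the GC side proceeds, the embryo is already on the queue.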

Further information on the vulnerability can be found in the following original references:

1. Linux Kernel Mailing List (LKML) Patch Submission
2. Linux Kernel Documentation

Understanding this vulnerability and its fix helps developers and maintainers confirm that their Linux kernel-based systems include the patch and are protected against problems arising from this race condition. The interleaving diagram, the description of the race, and the original references above are intended to support that understanding.

Timeline

Published on: 04/25/2024 06:15:57 UTC
Last modified on: 06/27/2024 12:15:22 UTC